confused about restructuring data: reshaping? stacking? something else?

david parsley

Join Date: Sep 2020
Posts: 14

25 Jul 2021, 14:44

Hi, I am using Stata 16.1 on Windows 10. I have included two datasets below: (1) a simple example of my input data as it is currently structured, and (2) the way I would like to structure it. I have tried using reshape and stack with no success.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(obs attribute date id1 id2 id3 id4 id5)
 1 1 2001.1    . -.87 -.02 -.92 -.49
 2 1 2001.2  -.4 -.85 -.05 -.08 -.86
 3 1 2001.3 -.22  -.5 -.13 -.76 -.32
 4 1 2001.4 -.53 -.15 -.92 -.37 -.65
 5 2 2002.1 -.72 -.18 -.12 -.99 -.24
 6 2 2002.2 -.22 -.45 -.03 -.81 -.27
 7 2 2002.3 -.19 -.45 -.12    . -.24
 8 2 2002.4 -.12 -.56 -.96 -.08 -.22
 9 3 2003.1 -.46 -.19 -.85 -.81 -.27
10 3 2003.2 -.64 -.13 -.68 -.86 -.43
11 3 2003.3 -.55 -.93  -.7 -.02 -.14
12 3 2003.4 -.57 -.35 -.72 -.33 -.37
end

and (2) the structure I'm trying to get:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(obs1 panelobs attribute1 id date1 x)
 1  1 1 1 2001.1    .
 2  2 1 1 2001.2  -.4
 3  3 1 1 2001.3 -.22
 4  4 1 1 2001.4 -.53
 5  5 1 2 2001.1 -.87
 6  6 1 2 2001.2 -.85
 7  7 1 2 2001.3  -.5
 8  8 1 2 2001.4 -.15
 9  9 1 3 2001.1 -.02
10 10 1 3 2001.2 -.05
11 11 1 3 2001.3 -.13
12 12 1 3 2001.4 -.92
13 13 1 4 2001.1 -.92
14 14 1 4 2001.2 -.08
15 15 1 4 2001.3 -.76
16 16 1 4 2001.4 -.37
17 17 1 5 2001.1 -.49
18 18 1 5 2001.2 -.86
19 19 1 5 2001.3 -.32
20 20 1 5 2001.4 -.65
21  1 2 1 2002.1 -.72
22  2 2 1 2002.2 -.22
23  3 2 1 2002.3 -.19
24  4 2 1 2002.4 -.12
25  5 2 2 2002.1 -.18
26  6 2 2 2002.2 -.45
27  7 2 2 2002.3 -.45
28  8 2 2 2002.4 -.56
29  9 2 3 2002.1 -.12
30 10 2 3 2002.2 -.03
31 11 2 3 2002.3 -.12
32 12 2 3 2002.4 -.96
33 13 2 4 2002.1 -.99
34 14 2 4 2002.2 -.81
35 15 2 4 2002.3    .
36 16 2 4 2002.4 -.08
37 17 2 5 2002.1 -.24
38 18 2 5 2002.2 -.27
39 19 2 5 2002.3 -.24
40 20 2 5 2002.4 -.22
41  1 3 1 2003.1 -.46
42  2 3 1 2003.2 -.64
43  3 3 1 2003.3 -.55
44  4 3 1 2003.4 -.57
45  5 3 2 2003.1 -.19
46  6 3 2 2003.2 -.13
47  7 3 2 2003.3 -.93
48  8 3 2 2003.4 -.35
49  9 3 3 2003.1 -.85
50 10 3 3 2003.2 -.68
51 11 3 3 2003.3  -.7
52 12 3 3 2003.4 -.72
53 13 3 4 2003.1 -.81
54 14 3 4 2003.2 -.86
55 15 3 4 2003.3 -.02
56 16 3 4 2003.4 -.33
57 17 3 5 2003.1 -.27
58 18 3 5 2003.2 -.43
59 19 3 5 2003.3 -.14
60 20 3 5 2003.4 -.37
end

Thanks for your help or suggestions.
David.

Clyde Schechter

Join Date: Apr 2014

Posts: 30148
#2

25 Jul 2021, 16:22

At its heart, this is a -reshape-. But you also did some dancing with the variable names, created new variables, obs1 and panelobs, that do not derive directly from the variables in the starting data, and changed the order of the variables. Those are not hard to implement.

Code:

rename id* x*
reshape long x, i(attribute date) j(id)
rename (attribute date) =1
drop obs
sort attribute1 id date1
gen long obs1 = _n
by attribute1 (obs1), sort: gen panelobs = _n
order obs1 panelobs attribute1 id date1 x
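
A few quick checks can confirm that the result has the intended shape. This is only a sketch, and it assumes the 12-row example data from #1 (3 attributes, 4 dates each, 5 id variables). The expected counts of 60 and 20 come from that example and would differ in the real data.

Code:

* Sanity checks for the reshaped example data (counts assume the example in #1)
isid attribute1 id date1                      // one row per attribute-id-date
assert _N == 60                               // 12 original rows x 5 id variables
bysort attribute1 (obs1): assert _N == 20     // 4 dates x 5 ids per attribute
bysort attribute1 (obs1): assert panelobs == _n
count if missing(x)                           // the example has 2 missing values

If any -assert- fails, Stata stops with an error at that line, which points to the step that needs another look.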
david parsley

Join Date: Sep 2020

Posts: 14
#3

25 Jul 2021, 17:01

Thanks. Yes, I wanted to make the panel structure explicit. Originally, I was trying to save these created variables into a single data set, which is why I needed new names. I'll have a look.