Generate Dyadic/hierarchical data set

Babette Zehl

Join Date: Apr 2018

Posts: 4
#1


25 Jun 2018, 08:05

Dear Statalist users,
I am working with data from five waves of the Survey of Health, Ageing and Retirement in Europe (SHARE) and would appreciate your help with creating a dyadic data set.

So far, I have merged all relevant questionnaire modules from the five waves into one large dataset and created an ID called "mergeid_n" to identify the respondents. In my case, respondents are European grandparents who reported in at least two waves that they were providing grandchild care and how often they did so (daily, monthly, about every week, etc.). After merging the waves, I declared the data to be panel data with the "xtset" command. I now have a dataset in long format containing all the variables I am interested in. Up to this point, everything went well.

But here is my problem: some respondents provided grandchild care not just for one child but for two or more children in a single observation/wave, and reported how intense the care for each specific child was. This information is stored in separate variables: e.g. sp015_1 indicates whether grandchild care was provided to child 1 [0 = no, 1 = yes], sp015_2 whether it was provided to child 2, etc. The same goes for grandchild care intensity: sp016_1 = intensity of grandchild care provided to child 1, sp016_2 = intensity of grandchild care provided to child 2, and so on. I would like to use this information to create a dyadic data set. To make it clearer, my current longitudinal dataset looks like this:

wave  mergeid_n  sp016_1  sp016_2  sp016_3           ch005_1 (sex child1)  ch005_2 (sex child 2)  ...
1     5          daily    monthly  about every week  male                  female
2     5          monthly  .        .                 male                  .
3     5          .        daily    .                 .                     female
4     5          daily    daily    daily             male                  female

Now I would like to have a data set which looks like this:
wave  mergeid_n  intensity of grandchild care  sex
1     5          sp016_1                       ch005_1 (sex child1 = male)
1     5          sp016_2                       ch005_2 (sex child 2 = female)
1     5          sp016_3
2     5
2     5
...
Unfortunately, I have no idea how to get there, since I don't know how to expand a dataset that is already in long format into even more rows.
I really hope you understand my problem and can help me solve it!
Thank you for your answers!
Babette
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

25 Jun 2018, 09:09

Well, your data set is not in fully long layout. It is long with respect to waves, but it is wide with respect to grandchildren. It is a hybrid layout. You can get what you want with:

Code:

reshape long sp016 ch005 sex_child, i(mergeid_n) j(grandchild_num)

Note: not tested because you did not post usable example data. While your example data is visually descriptive, it cannot be real Stata data because "sex child1", which contains a blank, is not a legal Stata variable name. It is more helpful to provide real example data from a real Stata data set using the -dataex- command.

In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use -dataex-.
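
As an illustration of the advice above, here is a minimal sketch of a -dataex- call. It uses the auto dataset that ships with Stata so that it runs as written; the commented-out line shows what an analogous call might look like for the data in this thread, assuming the variable names mentioned in it.

```stata
* Runnable illustration with a dataset shipped with Stata
sysuse auto, clear

* Post the first five observations of three variables as pasteable code
dataex make price mpg in 1/5

* With the SHARE extract in memory, an analogous call could be
* (variable names assumed from this thread):
* dataex mergeid_n wave ch005* sp016* in 1/20
```

Restricting the call to the relevant variables and a handful of observations keeps the posted example short while still preserving storage types and value labels.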

Babette Zehl

Join Date: Apr 2018
Posts: 4

27 Jun 2018, 05:33

Thanks for your quick reply, Mr. Schechter! I followed your advice, and below you can find an example of my data. It contains some selected characteristics of the respondents' children (such as sex, or the number of children each child has). I hope this gives you a better understanding of my dataset!

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(mergeid_n wave) byte(ch001 ch0051 ch0052 ch0053 ch0191 ch0192 ch0193 sp0161 sp0162 sp0163)
  5 4 6 2 1 2 0 0 2 . . .
  5 5 6 2 1 2 . . . . . .
  5 6 6 2 2 2 2 3 3 4 4 .
 20 4 2 1 2 . 2 2 . 1 . .
 20 5 2 1 2 . . . . 1 . .
 20 6 . . . . . . . . . .
 64 4 4 2 1 2 2 2 2 . 2 .
 64 5 4 2 1 2 . . . . 1 .
 64 6 . . . . . . . . . .
105 4 4 2 2 1 2 0 0 3 . .
105 5 . . . . . . . . . .
105 6 3 2 2 1 2 0 0 2 . .
129 1 3 2 2 2 1 0 0 . . .
129 2 3 2 2 2 2 1 0 2 2 .
129 4 3 2 2 2 . . . 1 2 3
129 5 3 2 2 2 . . . 3 2 2
129 6 . . . . . . . . . .
140 1 1 1 . . . . . 2 . .
140 2 1 1 . . 2 . . 2 . .
140 4 1 1 . . . . . 2 . .
end
label values ch001 dkrf
label values ch0191 dkrf
label values ch0192 dkrf
label values ch0193 dkrf
label values ch0051 gender
label values ch0052 gender
label values ch0053 gender
label def gender 1 "Male", modify
label def gender 2 "Female", modify
label values sp0161 howoftensp
label values sp0162 howoftensp
label values sp0163 howoftensp
label def howoftensp 1 "About daily", modify
label def howoftensp 2 "About every week", modify
label def howoftensp 3 "About every month", modify
label def howoftensp 4 "Less often", modify
label var wave "wave"
label var ch001 "Number of children"
label var ch0051 "Child 1 gender"
label var ch0052 "Child 2 gender"
label var ch0053 "Child 3 gender"
label var ch0191 "Child 1 number of children"
label var ch0192 "Child 2 number of children"
label var ch0193 "Child 3 number of children"
label var sp0161 "How often did you look after child of child 1"
label var sp0162 "How often did you look after child of child 2"
label var sp0163 "How often did you look after child of child 3"

I also ran the command you proposed (reshape long sp016 ch005, i(mergeid_n) j(grandchild_num)), but unfortunately I get an error message saying "variable id does not uniquely identify the observations - Your data are currently wide. You are performing a reshape long. You specified i(mergeid_n) and j(grandchild_num). In the current wide form, variable mergeid_n should uniquely identify the observations." Do you have an idea what the problem might be? Thanks again for your help!


Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#4

27 Jun 2018, 14:49

Yes, seeing the data I see that what is needed is:

Code:

reshape long sp016 ch005 sex_child, i(mergeid_n wave) j(grandchild_num)

That works with your example. If it does not do the trick for your full data, please repost with a data example that exhibits a problem.
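
To make the role of the second identifier concrete, here is a minimal sketch using four rows and a subset of the variables from the -dataex- example above. It checks with -isid- that mergeid_n alone does not uniquely identify the rows while mergeid_n and wave together do, and then reshapes. The name child_num for the j() variable is a hypothetical choice (judging by the variable labels, the numeric suffix appears to index the respondent's children), and dropping rows with no care information is an assumption about what is wanted, not part of the reshape itself.

```stata
* Four rows taken from the -dataex- example, restricted to ch005* and sp016*
clear
input float(mergeid_n wave) byte(ch0051 ch0052 ch0053 sp0161 sp0162 sp0163)
  5 4 2 1 2 . . .
  5 6 2 2 2 4 4 .
129 4 2 2 2 1 2 3
129 5 2 2 2 3 2 2
end

* i() must uniquely identify rows in the current layout.
* mergeid_n repeats across waves, so this check fails (nonzero return code):
capture isid mergeid_n
display "isid mergeid_n return code: " _rc

* Respondent plus wave is unique, so this passes silently
isid mergeid_n wave

* One row per respondent-wave-child; child_num is a hypothetical name
reshape long ch005 sp016, i(mergeid_n wave) j(child_num)

* Optional (assumption): keep only dyads with reported care intensity
drop if missing(sp016)

list, sepby(mergeid_n wave)
```

Running -isid- before -reshape- gives a quick way to confirm which combination of variables can serve as i() in the full data.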
