Dear Stata users
I am working with a household-level panel dataset with three rounds (the Bangladesh Integrated Household Survey, conducted in 2011/12, 2015, and 2018/19), but households split over time and are given new IDs. This is a problem because I want to analyse changes in poultry meat consumption (cap_poultrymeat_adj) within a household over time. I am looking for a way to rename the household ID (a01) in the earlier rounds to match the IDs used in the later rounds. My guess is that I will have to duplicate the original households and rename the copies to match the later rounds' IDs.
In the first round the IDs are whole numbers, but in later rounds they take the suffixes .1 ... .n if the household splits. For example, if a01 is 3 in rounds 1 and 2 but the household splits by round 3, it then appears as a01 = 3.1 and 3.2. I would therefore duplicate the round 1 and round 2 records and rename them 3.1 and 3.2, so that a01 3.1 and 3.2 each exist in every round and the same household can be compared across time.
Does anyone have advice on how to duplicate the relevant households? Alternatively, if anyone has worked with panel data containing split households and has a better approach, I would greatly appreciate hearing it.
So far I have tagged the split households with the following code (split = 1 if the household has split; missing if it is unchanged):
Code:
gen split = 1 if mod(a01,1) > 0
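A possible extension of this tagging step, sketched under the assumption that the integer part of a01 always identifies the original household. It recovers that original ID and flags every round of a household that splits at some point, so the pre-split records can be found as well. The names parent_id, is_split and ever_split are illustrative and are not survey variables; the toy rows are a subset of the example data.

```stata
* Toy subset of the example data (-clear- drops whatever is in memory)
clear
input double a01 byte round
2   1
2   2
2   3
3   1
3   2
3.1 3
3.2 3
end

* Integer part of a01 = original (pre-split) household; assumes a01 > 0
gen long parent_id = floor(a01)

* 0/1 indicator for a split-off record, rather than 1/missing
gen byte is_split = (a01 != parent_id)

* Flag every round of an original household that splits at some point
bysort parent_id: egen byte ever_split = max(is_split)

list, sepby(parent_id) noobs
```

With a 0/1 flag, conditions such as "if ever_split" select the round 1 and round 2 records of household 3 too, which a 1/missing tag on the round 3 records alone does not.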
Thank you very much for the advice.
All the best,
Emma
Here is an example of my dataset:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double a01 float(round split cap_all_adj cap_poultrymeat_adj)
1 1 . 1180.3214 0
1 2 . 1298.893 0
1 3 . 2455.658 0
2 1 . 673.5714 0
2 2 . 1254.357 71.42857
2 3 . 917.2857 0
3 1 . 443.2143 0
3 2 . 517.3393 0
3.1 3 1 1452.7858 0
3.2 3 1 995.5045 0
4 1 . 686.6071 0
4 2 . 1316 42.85714
4 3 . 873.9429 0
5 1 . 1369.2858 0
5 2 . 2096.0178 35.714287
5 3 . 1364.607 0
6 1 . 754.5223 0
6 2 . 686.869 0
6 3 . 879.6905 0
7 1 . 930.4762 23.809525
7 2 . 1800.4108 0
7 3 . 1528.6285 71.42857
8 1 . 1105.9504 0
end

Variable labels:
a01: "Unique panel identifier: household"
round: "Panel round. 1: 2011/12; 2: 2015; 3: 2018/19"
split: "Household split over time"
cap_all_adj: "Total food consumption g/capita/day"
cap_poultrymeat_adj: "Total poultrymeat consumption g/capita/day"
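One way the duplication described above might be done is with a crosswalk and -joinby-, rather than copying rows by hand. This is a minimal sketch, not a definitive solution. It assumes that the integer part of a01 identifies the original household, and that once a household splits its whole-number ID no longer appears (as with household 3 in the example). The names parent_id, final_id, max_id and hh_panel are hypothetical, and the toy rows are a subset of the example data.

```stata
* Toy subset of the example data (-clear- drops whatever is in memory)
clear
input double a01 byte round float cap_poultrymeat_adj
2   1 0
2   2 71.42857
2   3 0
3   1 0
3   2 0
3.1 3 0
3.2 3 0
8   1 0
end

* Original (pre-split) household = integer part of a01
gen long parent_id = floor(a01)

* Crosswalk: one row per (parent_id, final_id) pair
preserve
    bysort parent_id: egen double max_id = max(a01)
    * Keep split IDs, or the parent's own ID if it never split
    keep if a01 != parent_id | max_id == parent_id
    keep parent_id a01
    duplicates drop
    rename a01 final_id
    tempfile xwalk
    save `xwalk'
restore

* Pair every record with every final ID under the same parent
joinby parent_id using `xwalk'

* Keep the shared pre-split history plus each household's own records
keep if a01 == parent_id | a01 == final_id

* Stops with an error if a final household has two records in one round
isid final_id round
sort final_id round

* Integer panel identifier for -xtset-
egen long hh_panel = group(final_id)
xtset hh_panel round
```

In this sketch, households 3.1 and 3.2 each end up with rounds 1 to 3, sharing the round 1 and round 2 values of household 3, while households 2 and 8 are unchanged. Because the pre-split rows are copies, it may be worth clustering standard errors on parent_id rather than treating the split households as independent.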