
  • Panel data: adjusting for split households by duplicating and renaming original household IDs

    Dear Stata users

    I am working with a household-level panel dataset with three rounds (the Bangladesh Integrated Household Survey, conducted in 2011/12, 2015, and 2018/19), but some households split over time and are given new IDs. This is a problem because I want to analyse changes in poultry meat consumption (cap_poultrymeat_adj) within a household over time. I am looking for a way to rename the household ID (a01) in the earlier rounds to match the IDs used in the later rounds. My guess is that I will have to duplicate the original households and rename the copies to match the later-round IDs.

    In the first round the IDs are whole numbers, but in later rounds they take a suffix .1 ... .n if the household splits. For example, if a01 is 3 in rounds 1 and 2 but the household has split into a01 3.1 and 3.2 by round 3, I will duplicate the round 1 and round 2 observations and rename the copies 3.1 and 3.2, so that a01 3.1 and 3.2 exist in every round and the same household can be compared across time. I have tagged the split households with the following code (split = 1 if the household has split; missing if it is unchanged):

    Code:
     gen split = 1 if mod(a01,1) > 0
    Does anyone have any advice on how to now duplicate the relevant households? Alternatively, if anyone has worked with panel data with split households and has better advice for me I would greatly appreciate this.

    Thank you very much for the advice.
    All the best,
    Emma

    Here is an example of my dataset:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double a01 float(round split cap_all_adj cap_poultrymeat_adj)
      1 1 . 1180.3214         0
      1 2 .  1298.893         0
      1 3 .  2455.658         0
      2 1 .  673.5714         0
      2 2 .  1254.357  71.42857
      2 3 .  917.2857         0
      3 1 .  443.2143         0
      3 2 .  517.3393         0
    3.1 3 1 1452.7858         0
    3.2 3 1  995.5045         0
      4 1 .  686.6071         0
      4 2 .      1316  42.85714
      4 3 .  873.9429         0
      5 1 . 1369.2858         0
      5 2 . 2096.0178 35.714287
      5 3 .  1364.607         0
      6 1 .  754.5223         0
      6 2 .   686.869         0
      6 3 .  879.6905         0
      7 1 .  930.4762 23.809525
      7 2 . 1800.4108         0
      7 3 . 1528.6285  71.42857
      8 1 . 1105.9504         0
    end
    a01: "Unique panel identifier: household"
    round: "Panel round. 1: 2011/12; 2: 2015; 3: 2018/19"
    split: "Household split over time"
    cap_all_adj: "Total food consumption g/capita/day"
    cap_poultrymeat_adj: "Total poultry meat consumption g/capita/day"
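
    The tagging step described above could also be written with a 0/1 indicator plus a parent-level flag, which makes it easier to see which whole-number households need duplicating. This is only a sketch on toy data in the same layout as the example; is_split, parent, ever_split, tag_child and n_children are hypothetical names, and it assumes the decimal part of a01 is what marks a split.

```stata
* Toy data in the same layout as the dataex example (illustrative only)
clear
input double a01 float round
2   1
2   2
2   3
3   1
3   2
3.1 3
3.2 3
end

* 0/1 indicator: 1 if this observation carries a split (decimal) ID.
* The if-condition guards against missing IDs, because mod(., 1) > 0 is true in Stata.
gen byte is_split = mod(a01, 1) > 0 if !missing(a01)

* Whole-number parent ID shared by the original household and its splits
gen long parent = floor(a01)

* 1 in every round for a household that splits at some point
egen byte ever_split = max(is_split), by(parent)

* Number of distinct child IDs per parent
egen byte tag_child = tag(parent a01) if is_split == 1
egen n_children = total(tag_child), by(parent)

list, sepby(parent) noobs
```

    Here ever_split marks the round 1 and round 2 rows of household 3 as rows that need copies, and n_children says how many copies each of them needs.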

  • #2
    I think I may have solved this issue now, thank you. Maybe someone could advise me on a shortcut, or give me better advice on split households and panel analysis.
    Thank you all,
    Emma

    Code:
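    * Parent household ID: the whole-number part of a01
    * (floor() rather than round(), so that a suffix of .5 or above is not rounded up)
    gen a01_1 = floor(a01)

    * Flag every observation whose parent household splits in some round
    egen split_1 = count(split), by(a01_1)
    replace split_1 = 1 if split_1 > 0

    * Duplicate the pre-split observations of households that later split
    expand 2 if split == . & split_1 == 1, gen(dupindicator)
    br a01 a01_1 split split_1 round dupindicator

    * Give one copy to each child ID (this assumes exactly two children, .1 and .2)
    replace a01_1 = a01 + 0.1 if split == . & dupindicator == 1 & split_1 == 1
    replace a01_1 = a01 + 0.2 if split == . & dupindicator == 0 & split_1 == 1
    * Observations that already carry a split ID keep it
    replace a01_1 = a01 if split == 1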
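
    The code above is written for households that split into exactly two (.1 and .2). One possible way to handle any number of splits is to pair each pre-split observation with every child ID of its parent using -joinby-. The following is only a sketch on toy data, not a tested solution for the full survey: parent, is_child, child, hh and hh_id are hypothetical names, and it assumes that a whole-number a01 does not appear in the same round as its own split IDs.

```stata
* Toy data in the same layout as the dataex example (illustrative only)
clear
input double a01 float round
2   1
2   2
2   3
3   1
3   2
3.1 3
3.2 3
end

* Parent ID shared by a household and all of its later splits
gen long parent = floor(a01)
gen byte is_child = mod(a01, 1) > 0 if !missing(a01)

* Lookup table: one row per (parent, child ID)
preserve
keep if is_child == 1
keep parent a01
duplicates drop
rename a01 child
tempfile children
save `children'
restore

* Set aside the observations that already carry a child ID
preserve
keep if is_child == 1
gen double hh = a01
tempfile post
save `post'
restore

* Pair each pre-split observation with every child of its parent;
* households that never split stay unmatched and keep their own ID
drop if is_child == 1
joinby parent using `children', unmatched(master)
gen double hh = cond(missing(child), a01, child)
drop _merge child
append using `post'

* Integer panel ID built from hh, then declare the panel
egen long hh_id = group(hh)
sort hh_id round
xtset hh_id round
```

    Because hh is copied from the child IDs already in the data rather than computed as a01 + 0.1, the match does not depend on floating-point arithmetic. Note that the duplicated pre-split rows are not independent observations, so standard errors may need to be clustered on the parent household.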
