Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Panel data: Merging on district (by wave)

    Good day

    I am conducting research on the effects of energy poverty on human development outcomes in South Africa. I am using a panel dataset: all five waves of the National Income Dynamics Survey (South Africa).
    As pointed out by literature, electricity tariff/ price may serve as an appropriate instrumental variable for energy poverty.

    NIDS has information on electricity expenditure, but not on price.
    I have had to compile my own dataset on South Africa's average electricity price for the wave year by local district. I successfully did so and imported the excel sheet into Stata to create a dataset.
    My issue comes into play now: I want to merge the electricity price data with the NIDS panel so that each district has the price of electricity for that year.

    Here is an example of the code that I have tried to use to make a unique district-wave identifier (having also done it manually):
    Code:
    egen dis_wave = concat(district wave)
    Code:
     merge 1:1 dis_wave using "NERSA_33.dta", force
    However, Stata gives the following error message:
    variable dis_wave does not uniquely identify observations in the master data
    I am really struggling to find any resources that point to an adequate way to achieve this. Your help would be greatly appreciated.

    Here is an example of the dataset(s):

    NIDS:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(pid hhid) int district byte wave str4 dis_wave
    410155 210164  24 2 "242" 
    313284 214163  24 2 "242" 
    408391 511458  24 5 "245" 
    317120 502738  26 5 "265" 
    320846 301950 275 3 "2753"
    492603 502472  38 5 "385" 
    381096 300138  44 3 "443" 
    412211 410930   6 4 "64"  
    310042 121566 774 1 "7741"
    403343 113033  83 1 "831" 
    end
    label values district w5_dc2001
    label def w5_dc2001 6 "6. Namakwa District Municipality", modify
    label def w5_dc2001 24 "24. Umzinyathi District Municipality", modify
    label def w5_dc2001 26 "26. Zululand District Municipality", modify
    label def w5_dc2001 38 "38. Central District Municipality", modify
    label def w5_dc2001 44 "44. Alfred Nzo District Municipality", modify
    label def w5_dc2001 83 "83. Sekhukhune Cross Boundary District Municipality", modify
    label def w5_dc2001 275 "275. Nelson Mandela Metropolitan Municipality", modify
    label def w5_dc2001 774 "774. City Of Johannesburg Metropolitan Municipality", modify
    label values wave WA
    label def WA 1 "wave 1", modify
    label def WA 2 "wave 2", modify
    label def WA 3 "wave 3", modify
    label def WA 4 "wave 4", modify
    label def WA 5 "wave 5", modify
    ELECTRICITY TARIFF DATA:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(pid hhid) int district byte wave str4 dis_wave
    410155 210164  24 2 "242" 
    313284 214163  24 2 "242" 
    408391 511458  24 5 "245" 
    317120 502738  26 5 "265" 
    320846 301950 275 3 "2753"
    492603 502472  38 5 "385" 
    381096 300138  44 3 "443" 
    412211 410930   6 4 "64"  
    310042 121566 774 1 "7741"
    403343 113033  83 1 "831" 
    end
    label values district w5_dc2001
    label def w5_dc2001 6 "6. Namakwa District Municipality", modify
    label def w5_dc2001 24 "24. Umzinyathi District Municipality", modify
    label def w5_dc2001 26 "26. Zululand District Municipality", modify
    label def w5_dc2001 38 "38. Central District Municipality", modify
    label def w5_dc2001 44 "44. Alfred Nzo District Municipality", modify
    label def w5_dc2001 83 "83. Sekhukhune Cross Boundary District Municipality", modify
    label def w5_dc2001 275 "275. Nelson Mandela Metropolitan Municipality", modify
    label def w5_dc2001 774 "774. City Of Johannesburg Metropolitan Municipality", modify
    label values wave WA
    label def WA 1 "wave 1", modify
    label def WA 2 "wave 2", modify
    label def WA 3 "wave 3", modify
    label def WA 4 "wave 4", modify
    label def WA 5 "wave 5", modify

  • #2
    Sohie:
    I was successful in merging Sophie_1 and Sophie_2 via :
    Code:
    .. use "C:\Users\user\Desktop\Sophie_1.dta"
    
    . merge 1:1 pid hhid wave using "C:\Users\user\Desktop\Sophie_2.dta"
    (label w5_dc2001 already defined)
    (label WA already defined)
    
        Result                           # of obs.
        -----------------------------------------
        not matched                             0
        matched                                10  (_merge==3)
        -----------------------------------------
    
    .
    I'm not sure whether this is what you're after, though.
    Kind regards,
    Carlo
    (Stata 19.0)

    Comment


    • #3
      it is clear from your examples that neither district nor wave nor their combination gives you the distinct id that you seem to want (but don't see the electricity price data in your electricity tariff data anyway); however, at least in your examples, pid and hhid are distinct identifiers; if that is not true in the larger data, maybe it's true that the combination of pid (and/or hhid) along with district and/or wave will work - but there's really not enough to go here - what I would have expected is that the NIDS data might use pid/hhid as a distinct id while the electricity tariff data would use district/wave and you would do an m:1 or 1:m merge - so I think you need to give us more info

      added: crossed with Carlo's answer

      Comment

      Working...
      X