Hi all,
I have a question about combining two cross-sectional datasets. I am using a survey that was carried out in 2006 and again in 2012. Approximately 50% of those interviewed in 2006 were reinterviewed in 2012. I am planning a difference-in-differences analysis that compares each individual in 2006 with the same individual in 2012, to see how changes in that person’s resources affect my dependent variable (a score predicted using factor analysis). However, I am not sure whether I am combining the data correctly to create a panel.
In each cross-section, one variable uniquely identifies each person: the individual ID.
These IDs are not the same in the 2006 and 2012 cross-sections. In the combined dataset I have two variables: indid and indid_06. If a person has a value for both, they were interviewed in both 2012 and 2006. If they have no value for indid_06, they were interviewed only in 2012. Please see below using dataex:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double indid long indid_06 float year byte(urban sex)
12010409102         . 2012 0 2
12010550102 601072302 2012 0 2
12011172102 621546503 2012 0 2
12011173102 601016704 2012 0 2
12011025102 601089002 2012 0 2
12011176102 601024503 2012 0 2
12011070102         . 2012 0 2
12011180102 624655703 2012 0 2
12011184102         . 2012 0 2
12010046102 601079402 2012 0 2
12010886102         . 2012 0 2
12010755102         . 2012 0 2
12010882102 601103802 2012 0 2
12010516102 601073202 2012 0 2
12011127102 612233903 2012 0 2
12010623102 601055605 2012 0 2
12010481102 601100102 2012 0 2
12010830102 601013702 2012 0 2
12010567102         . 2012 0 2
12010353102 601033103 2012 0 2
12011008102 601061602 2012 0 2
12010665102         . 2012 0 2
12010329102 601105602 2012 0 2
12010870102         . 2012 0 2
12010868102         . 2012 0 2
12011171102 622588912 2012 0 2
12011032102 601004802 2012 0 2
12010477102 601073702 2012 0 2
12011035102 601004603 2012 0 2
12010936102         . 2012 0 2
12011157102 601108503 2012 0 2
12010473102 601027302 2012 0 2
12010855102 601011902 2012 0 2
12011153102 601012503 2012 0 2
12011040102 601088102 2012 0 2
12010591102 601071502 2012 0 2
12011143102         . 2012 0 2
12010031102 621569503 2012 0 2
12010443102 601101202 2012 0 2
12010179102 601056502 2012 0 2
12010445102 601101002 2012 0 2
12011140102         . 2012 0 2
12010896102         . 2012 0 2
12011129102 601012406 2012 0 2
12010721102 601095102 2012 0 2
12010931102         . 2012 0 2
12011124102 601002103 2012 0 2
12011046102 601003802 2012 0 2
12010606102 601018902 2012 0 2
12011110102 601085001 2012 0 2
12010933102         . 2012 0 2
12011092102 601086602 2012 0 2
12010461102         . 2012 0 2
12010492102 601098402 2012 0 2
12011083102 601002502 2012 0 2
12011076102         . 2012 0 2
12010319102 601107202 2012 0 2
12010866102 601012404 2012 0 2
12010614102 601070402 2012 0 2
12010750102         . 2012 0 2
12011112102 601084502 2012 0 2
12010556102         . 2012 0 2
12010616102 601070202 2012 0 2
12010414102 601030402 2012 0 2
12010456102         . 2012 0 2
12010865102         . 2012 0 2
12010208102 601041802 2012 0 2
12010619102 601019304 2012 0 2
12010527102 601023802 2012 0 2
12010280102 601110704 2012 0 2
12010279102 601110706 2012 0 2
12010277102 601075801 2012 0 2
12010621102         . 2012 0 2
12010125102 601112702 2012 0 2
12010416102 601030202 2012 0 2
12011037102 601088302 2012 0 2
12010014102 601054402 2012 0 2
12010553102 601097205 2012 0 2
12010088102 601047802 2012 0 2
12011017102         . 2012 0 2
12010565102         . 2012 0 2
12010108102         . 2012 0 2
12010004102         . 2012 0 2
12010563102 601097202 2012 0 2
12010667102         . 2012 0 2
12010971102         . 2012 0 2
12010961102 601062702 2012 0 2
12010105102 601046502 2012 0 2
12010656102         . 2012 0 2
12010564102 601097203 2012 0 2
12010862102 601065402 2012 0 2
12010784102         . 2012 0 2
12011156102 601099405 2012 0 2
12010169102 601044203 2012 0 2
12010482102 601100002 2012 0 2
12010261102 601037302 2012 0 2
12010736102         . 2012 0 2
12010985102 601090602 2012 0 2
12010197102         . 2012 0 2
12010807102         . 2012 0 2
end
label values urban urban
label values sex q101
label def q101 2 "female", modify
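For what it is worth, this is a minimal sketch of the kind of sanity check I could run on the 2012 file before combining anything. It uses only three rows copied from the example and standard commands (isid, duplicates report); the flag variable in_both is a name I made up for illustration.

```stata
* Sketch only: three rows copied from the dataex example, not the full file
clear
input double indid long indid_06 float year
12010409102         . 2012
12010550102 601072302 2012
12011172102 621546503 2012
end

* The 2012 ID should identify each row uniquely (isid stops with an error if not)
isid indid

* No 2006 ID should be linked to more than one 2012 respondent
duplicates report indid_06 if !missing(indid_06)

* in_both (hypothetical name): 1 if the 2012 respondent carries a 2006 ID
gen byte in_both = !missing(indid_06)
tab in_both
```

If isid fails or duplicates report shows surplus copies, I assume the linkage variable itself needs fixing before any append or merge.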
The command I used was:
Code:
append using "C:\Users\Shellfile\Dropbox\Data\cross section 2006.dta"
The 2012 cross-section was the master data. My question is: is there a way to check that I have combined the data correctly? Am I wrong to use append instead of merge?
I am relatively new to Stata, so any help is appreciated.
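To make the question concrete, here is a minimal sketch of how I understand the append route and one way I might check it. It rests on assumptions that may not match my real files: that the 2006 file stores its ID in a variable named indid_06, and that a 2006 ID never takes the same value as a 2012 ID. The 2006 rows below are invented, and pid and nwaves are names I made up. It is meant to be run as a single do-file so that the tempfile macro stays defined.

```stata
* Toy 2012 wave: three rows copied from the dataex example
clear
input double indid long indid_06 float year
12010409102         . 2012
12010550102 601072302 2012
12011172102 621546503 2012
end
tempfile w2012
save `w2012'

* Toy 2006 wave (invented rows; assumes the ID variable is called indid_06)
clear
input long indid_06 float year
601072302 2006
621546503 2006
601999999 2006
end

* Stack the two waves in long form: one row per person-year
append using `w2012'

* pid (hypothetical): the 2006 ID where one exists, otherwise the 2012 ID
gen double pid = cond(!missing(indid_06), indid_06, indid)
format pid %12.0f

* Each person should appear at most once per year (stops with an error if not)
isid pid year

* nwaves (hypothetical): 2 = observed in both waves, 1 = one wave only
bysort pid: gen byte nwaves = _N
tab nwaves year

* Declare the panel structure
xtset pid year
```

As far as I understand, append gives the long layout (one row per person-year) that xtset expects, whereas merge would place the two waves side by side in wide form, so the check above only covers the long layout.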
Thanks.