Hi all,
I have read in several posts that m:m merges in Stata should always be avoided; however, I am in a situation where I think one offers the appropriate solution. Data set A (shown below) links parents to children, so a child can appear multiple times if they have multiple parents in the data (sometimes biological and perhaps step-parents). Here, hhid identifies the household and opn identifies people in the household who are not the respondent, in this case children.
Data set B (shown below) has extended information (essentially a variable that would be k13educ) but contains only hhid and opn as identifiers. When I merge the two using "merge m:m hhid opn", I seem to get the desired result; however, I am concerned that this is not appropriate.
Data set A
Code:
. list hhidpn hhid pn opn k11educ k12educ kidid karel if hhid == "011378"

     +------------------------------------------------------------------------+
     |   hhidpn     hhid    pn   opn   k11educ   k12educ        kidid   karel |
     |------------------------------------------------------------------------|
829. | 11378010   011378   010   101        .p        .p   0113780101   1.kid |
830. | 11378010   011378   010   102        .p        .p   0113780102   1.kid |
831. | 11378020   011378   020   101        .p        .p   0113780101   1.kid |
832. | 11378020   011378   020   102        .p        .p   0113780102   1.kid |
833. | 11378020   011378   020   151         .         .   0113780151   1.kid |
     +------------------------------------------------------------------------+
Data set B
Code:
. list hhid opn pe029 if hhid == "011378"

     +----------------------+
     |   hhid   opn   pe029 |
     |----------------------|
340. | 011378   101       . |
341. | 011378   101       . |
342. | 011378   102       . |
343. | 011378   102       . |
344. | 011378   504       . |
     |----------------------|
345. | 011378   504       . |
346. | 011378   505       . |
347. | 011378   505       . |
     +----------------------+
Data set A
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long hhidpn str6 hhid str3(pn opn) byte(k11educ k12educ) str10 kidid byte karel
11378010 "011378" "010" "101" .p .p "0113780101" 1
11378010 "011378" "010" "102" .p .p "0113780102" 1
11378020 "011378" "020" "101" .p .p "0113780101" 1
11378020 "011378" "020" "102" .p .p "0113780102" 1
11378020 "011378" "020" "151" .  .  "0113780151" 1
end
label values k11educ EDUC
label values k12educ EDUC
label values karel RELR
label def RELR 1 "1.kid", modify
Data set B
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str6 hhid str3 opn byte pe029
"011378" "101" .
"011378" "101" .
"011378" "102" .
"011378" "102" .
"011378" "504" .
"011378" "504" .
"011378" "505" .
"011378" "505" .
end
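For comparison, here is a minimal sketch of the alternative I could try instead of m:m. It assumes the repeated hhid/opn rows in data set B are exact copies of each other, which I have not verified, and the file names datasetA and datasetB are placeholders rather than my real file names.

```stata
* Hypothetical file names; substitute the real ones.
use datasetB, clear

* Report how many hhid-opn combinations appear more than once.
duplicates report hhid opn

* Drop rows identical on every variable (assumes duplicates are exact copies).
duplicates drop

* Stops with an error if hhid and opn still do not uniquely identify rows.
isid hhid opn

tempfile kids
save `kids'

use datasetA, clear

* Each child row in A picks up the single matching row from B.
merge m:1 hhid opn using `kids'
tab _merge
```

If isid fails, the remaining duplicates in B differ on some variable, and I would need to decide which row to keep before merging.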