Dear Stata community,
I am fairly new to Stata and am trying to wrap my head around creating new variables from the content of the partner's observation.
I have a wide dataset where each observation is a single individual. Let's say it looks like the following:
ID PartnerID Var1 Var2
1 5 “hi” “ho”
2 3 “cat” “flower”
3 2 “bird” “stone”
4 . “Frog” “cycle”
5 1 “Jupiter” “lollipop”
Now I am trying to add each partner's variables as new variables on the index person's observation, like this:
ID PartnerID Var1 PartnerVar1 Var2 PartnerVar2
1 5 “hi” “Jupiter” “ho” “lollipop”
2 3 “cat” “bird” “flower” “stone”
3 2 “bird” “cat” “stone” “flower”
4 . “Frog” "" “cycle” ""
5 1 “Jupiter” “hi” “lollipop” “ho”
The following syntax worked fine initially:
gen PartnerVar1 = Var1[PartnerID]
However, it depends on the ID variable being an uninterrupted sequence, because Var1[PartnerID] looks up the observation (row) number rather than the value of ID. If the sequence has gaps (e.g. 1 2 3 5 6 7 8 10 15), there will be a mismatch.
Do any of you have suggestions on how to match by the content of PartnerID rather than by row number? Preferably without a foreach/forvalues loop, as there are approx. 70 variables and 1,000,000 observations.
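To illustrate, here is a minimal sketch of the mismatch. It assumes a toy version of the data above with ID 4 dropped, so that the IDs no longer line up with the row numbers (the toy data are made up purely for illustration):

```stata
clear
* Toy data: same as the example above, but without ID 4,
* so ID 5 now sits in row 4
input ID PartnerID str10 Var1
1 5 "hi"
2 3 "cat"
3 2 "bird"
5 1 "Jupiter"
end

* Var1[PartnerID] reads the row whose position (_n) equals PartnerID,
* not the row whose ID equals PartnerID
gen PartnerVar1 = Var1[PartnerID]

* ID 1 has PartnerID 5, but there is no row 5 any more, so the
* subscript is out of range and PartnerVar1 is "" instead of "Jupiter"
list, clean noobs
```

In this small case only ID 1 is affected, but with more gaps the subscript can also land on an existing row that belongs to the wrong person.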
Kind regards,
Joel