Hello!
I'm using Stata 15.
My data consists of 34,000 observations, each with 2,000+ variables.
I would like to search through one of the variables for all observations, attempting to match it with another variable and then perform an operation if there is a match.
Currently each observation has its own ID number (var: PID), plus the ID number of the first child (var: bhchild_id1), second child (var: bhchild_id2), third child, etc., all as separate variables on the same observation.
The children's ID numbers also appear as the identifier (PID) of their own observation, rather than only as a variable in their mother's/father's observation.
Each child has a dummy variable indicating whether or not a child support grant is received on their behalf (var: CSG, coded 0/1). It is currently only coded for children.
I would like to run a loop that searches through PID for a match to each bhchild_id number (bhchild_id1, bhchild_id2, and so on). If a match is found and CSG=1 for that match, I want to create a new variable on the parent's observation, such as parentCSG=1; similarly, parentCSG=0 if a match is found but CSG=0 for the child.
The data looks something like this:
PID bhchild_id1 bhchild_id2 CSG
555 557 558 .
556 . . .
557 . . 1
558 . . 0
Ideally this would result in a new variable for 555 (the mother) with parentCSG=1, even though they have one child who they do not collect the grant on behalf of.
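To make the goal concrete, here is a rough, untested sketch of the logic I think I am after, written as a merge per child variable rather than a search loop. It assumes that PID uniquely identifies observations, that PID and the bhchild_id variables have the same storage type, and that the wildcard bhchild_id* matches only the child ID variables. The names child_id, child_CSG and kids are made up for the illustration.

```stata
* Untested sketch; child_id, child_CSG and kids are hypothetical names.
* Step 1: build a lookup file with one row per person: their ID and their CSG.
preserve
keep PID CSG
rename PID child_id
rename CSG child_CSG
tempfile kids
save `kids'
restore

* Step 2: for each child ID variable, pull in that child's CSG
* and keep the largest value seen so far on the parent's row.
gen byte parentCSG = .
foreach v of varlist bhchild_id* {
    rename `v' child_id
    * Assumes no missing PID in the lookup file, so missing child IDs stay unmatched.
    merge m:1 child_id using `kids', keep(master match) nogenerate
    * max() ignores missing arguments, so any child with CSG==1 gives 1,
    * and a parent whose matched children all have CSG==0 gets 0.
    replace parentCSG = max(parentCSG, child_CSG) if !missing(child_CSG)
    drop child_CSG
    rename child_id `v'
}
```

If this works the way I intend, parentCSG would be 1 for PID 555 in the example above and would stay missing for anyone with no matched child. Note that merge changes the sort order of the data.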
I have tried multiple loop expressions, combining variables and taking left values, searches through arrays - but it seems my Stata syntax isn't quite where I'd like it to be.
Any help would be much appreciated. If you need more info, please let me know!
Thanks in advance
Zac