I have used PSID tools to create a panel consisting exclusively of family-level variables. I am trying to clean the data, and whenever an observation has to go I want to drop the entire subject so that the panel stays balanced (I have already unbalanced it once). For example, I want to drop every subject that has a change in family composition in any year (famcompch > 0 in any wave).
Variables of interest are:
x11101ll = ID variable
wave = year variable
famcompch = the variable I am attempting to clean
Here is an example of some of the variables from dataex:
input long x11101ll int(wave x11102) byte(xsqnr famcompch)
4003 1 . . .
4003 3 . . .
4003 5 . . .
4003 7 . . .
4003 1999 2 1 0
4003 2001 96 1 1
4003 2003 1392 1 0
4003 2005 289 1 0
4003 2007 148 1 0
4004 1 . . .
4004 3 . . .
4004 5 . . .
4004 7 . . .
4004 1999 6129 1 3
4004 2001 5987 1 0
4004 2003 6278 1 0
4004 2005 2356 1 0
4004 2007 5399 1 0
4006 1 . . .
4006 3 . . .
4006 5 . . .
4006 7 . . .
4006 1999 4920 2 0
4006 2001 5599 2 0
4006 2003 4812 1 3
4006 2005 4097 1 0
4006 2007 720 1 0
4031 1 . . .
4031 3 . . .
4031 5 . . .
4031 7 . . .
4031 1999 1702 1 0
4031 2001 285 2 4
4031 2003 1427 2 0
4031 2005 1157 2 0
4031 2007 196 2 0
4033 1 . . .
4033 3 . . .
4033 5 . . .
4033 7 . . .
4033 1999 2 4 0
4033 2001 5479 1 5
4033 2003 6061 1 2
4033 2005 641 1 0
4033 2007 189 1 0
4039 1 . . .
4039 3 . . .
4039 5 . . .
4039 7 . . .
4039 1999 2 3 0
4039 2001 96 3 1
4039 2003 1392 3 0
4039 2005 289 3 0
4039 2007 148 3 0
I have read what feels like every thread on Statalist and Google about how to do this, yet I still end up dropping my entire dataset when I use -bysort-, or getting no results when I use -egen-. After reading -help by- I tried my own hand at it, and I always get something like:
. bysort x11101ll (famcompch): keep if famcompch[_N] == 0
(126,207 observations deleted)
Help!
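A possible explanation, offered as a sketch rather than a tested fix: Stata sorts missing values after all nonmissing numbers, so once famcompch is sorted within x11101ll, famcompch[_N] is missing for every subject that has a missing wave (waves 1, 3, 5 and 7 in the example). The condition famcompch[_N] == 0 is then false for everyone, and every observation is deleted. One alternative is to take the per-subject maximum with -egen-, whose max() function ignores missing values. The variable name maxchange below is hypothetical.

```stata
* Per-subject maximum of famcompch; egen's max() ignores missing values
bysort x11101ll: egen maxchange = max(famcompch)

* Keep only subjects whose family composition never changed.
* maxchange is missing for a subject with no nonmissing famcompch at all,
* so such subjects are dropped as well.
keep if maxchange == 0
drop maxchange
```

Note that each of the six subjects in the example data has at least one nonzero famcompch, so this would drop all of them; the effect is only visible on the full dataset.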