I'm using PSID tools. I cleaned all my data, ran some numbers, and was an almost happy camper until I discovered that (1) I had unbalanced and hacked my panel apart using -drop- (see my previous post "Panel Data 1"), and (2) I only wanted family-level variables for my family-level unit of analysis, but PSID tools effectively imports a panel of individuals and grafts the family-level data onto each person. That means I have a panel of every individual in every family, with the same family-level values repeated for each individual. For example, I have four identical family income observations for 4-person family A in year x, and one family income observation for 1-person family B in year x. I just want one observation for each family in each year.
How do I solve this second problem? Here are the variables:
x11101ll = person identification number, unique to each person
wave = year
x11102 = 1999 interview number. This is a family-level ID number, but it changes every year. Within a family unit, person IDs will have the same 1999 interview number in a given year. But that interview number will go to a different family in a different year. I have 5 waves: 1999, 2001, 2003, 2005, 2007.
xsqnr = sequence number. As far as I can tell, this is used for multifamily households to identify who was interviewed in what order.
So for any given year, the family-level info is the same for every family member. For example, for family 9 I have four identical family income observations with different sequence numbers and different personal ID numbers, four identical age of head of household observations, etc.
As mentioned before, I just want one set of observations per family. Any help dropping extra family members is much appreciated.
In my previous post, I failed at dataex despite reading help dataex, but I will nonetheless try again to use dataex to post my relevant variables here:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long x11101ll int(wave x11102) byte(xsqnr state famcompch)
4003    1    . . .  .
4003    3    . . .  .
4003    5    . . .  .
4003    7    . . .  .
4003 1999    2 1 41 0
4003 2001   96 1 41 1
4003 2003 1392 1 41 0
4003 2005  289 1 41 0
4003 2007  148 1 41 0
4004    1    . . .  .
4004    3    . . .  .
4004    5    . . .  .
4004    7    . . .  .
4004 1999 6129 1 41 3
4004 2001 5987 1 41 0
4004 2003 6278 1 41 0
4004 2005 2356 1 41 0
4004 2007 5399 1 41 0
4006    1    . . .  .
4006    3    . . .  .
4006    5    . . .  .
4006    7    . . .  .
4006 1999 4920 2 15 0
4006 2001 5599 2 15 0
4006 2003 4812 1 15 3
4006 2005 4097 1 15 0
4006 2007  720 1 41 0
4031    1    . . .  .
4031    3    . . .  .
4031    5    . . .  .
4031    7    . . .  .
4031 1999 1702 1 41 0
4031 2001  285 2 41 4
4031 2003 1427 2 41 0
4031 2005 1157 2 41 0
4031 2007  196 2 41 0
4033    1    . . .  .
4033    3    . . .  .
4033    5    . . .  .
4033    7    . . .  .
4033 1999    2 4 41 0
4033 2001 5479 1 41 5
4033 2003 6061 1 41 2
4033 2005  641 1 41 0
4033 2007  189 1 41 0
4039    1    . . .  .
4039    3    . . .  .
4039    5    . . .  .
4039    7    . . .  .
4039 1999    2 3 41 0
4039 2001   96 3 41 1
4039 2003 1392 3 41 0
4039 2005  289 3 41 0
4039 2007  148 3 41 0
4041    1    . . .  .
4041    3    . . .  .
4041    5    . . .  .
4041    7    . . .  .
4041 1999 1702 2 41 0
4041 2001  285 3 41 4
4041 2003 1427 3 41 0
4041 2005 1157 3 41 0
4041 2007  196 3 41 0
4042    1    . . .  .
4042    3    . . .  .
4042    5    . . .  .
4042    7    . . .  .
4042 1999 1702 3 41 0
4042 2001  285 4 41 4
4042 2003 1427 4 41 0
4042 2005 1157 4 41 0
4042 2007  196 4 41 0
4173    1    . . .  .
4173    3    . . .  .
4173    5    . . .  .
4173    7    . . .  .
4173 1999    2 2 41 0
4173 2001   96 2 41 1
4173 2003 1392 2 41 0
4173 2005  289 2 41 0
4173 2007  148 2 41 0
4180    1    . . .  .
4180    3    . . .  .
4180    5    . . .  .
4180    7    . . .  .
4180 1999 3818 3 41 4
4180 2001 5964 3 41 1
4180 2003 6443 2 41 6
4180 2005  771 2 41 0
4180 2007 1130 1 41 3
5002    1    . . .  .
5002    3    . . .  .
5002    5    . . .  .
5002    7    . . .  .
5002 1999  376 1 41 0
5002 2001  444 1 41 1
5002 2003 3724 1 41 1
5002 2005 1654 1 41 0
5002 2007 1210 1 41 0
5003    1    . . .  .
end
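For what it's worth, here is a minimal sketch of one way to get down to one row per family per wave. It rests on assumptions that are not confirmed above: that x11102 together with wave identifies a family in a given year, that the rows with wave equal to 1, 3, 5, 7 and a missing interview number are placeholders that can be discarded, and that it does not matter which family member's row is kept because the family-level variables are identical within a family-year.

```stata
* Work on a copy so the individual-level panel is preserved.
preserve

* ASSUMPTION: rows with no interview number carry no family-level data.
drop if missing(x11102)

* Within each family-year (wave x11102), sort by sequence number and
* person ID, then keep only the first row. The parenthesised variables
* are used for sorting only, so the choice of row is deterministic.
bysort wave x11102 (xsqnr x11101ll): keep if _n == 1

* Check that wave and x11102 now uniquely identify observations;
* -isid- exits with an error if they do not.
isid wave x11102

* ... family-level analysis goes here ...

restore
```

On the example data, family 2 in 1999 has four members (4003, 4033, 4039, 4173), and this would keep the row for 4003, the one with xsqnr equal to 1. Some families in the excerpt have no member with sequence number 1 (for instance interview number 285 in 2001), in which case the lowest sequence number present is kept. Because x11102 is reassigned every year, the result is a set of family-year records and not yet a panel that follows the same family over time; linking families across waves would need a separate identifier.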