
  • Longitudinal Survey: Tracking individuals who belong to the same household

    Hello Everyone,

    I am working with a longitudinal data set (intergenerational education mobility) that has 5 waves over the years. In every wave the old respondents are tracked, and some new respondents are added as well. Data were collected in terms of relation to the head of the household, so I have data on all members living in the same household. Moreover, the data address the co-residence problem as well, because respondents are asked to provide information (education and income) about their parents, siblings, and children, dead or alive, who belong to the same household but live outside it. The household identifier is HHID, and individuals are tracked through PIDLINK.

    Here is my question.

    The first wave was conducted in 2000, and for it I have four data files. The first file has education information for all household members; the second covers parents who are not living in the household; the third covers siblings living outside; and the last covers children living outside. Every file has HHID and pidlink (unique identifier). How can I use these four files to make pairs? Do I need to merge all four files using pidlink? Will that automatically add the education and relation-to-head information, within the same household, for those who were living outside?


    My apologies if this post is not in accordance with the forum rules; I am new here and have only just started reading them. Thanks.

  • #2
    I find your question unclear. I don't understand what the four data files look like, and I don't understand what you want the result to look like. I suggest you use the -dataex- command to post short examples of data from each of the four files (preferably including some records that you want to match up). Then, by hand, make up a data set that looks like the results you want to see, and post an example of that.

    If you are not familiar with -dataex-, you will find information about it in FAQ #12.
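
    As a minimal sketch of the kind of -dataex- call meant here, the following uses Stata's bundled auto data set rather than your survey files; the variable list and the observation range are placeholders to replace with your own.

```stata
* Illustration only: the auto data set stands in for the survey files.
sysuse auto, clear

* Export the chosen variables for the first five observations
* in a form that can be pasted directly into a forum post.
dataex make price mpg in 1/5
```

    Running the same command on each of your files, restricted to a handful of households that appear in all of them, would give others something they can load and experiment with.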



    • #3
      Thank you for the reply. Let me rephrase my question with a data example and a simplified case. This is the first wave of my data, for which there are now two files. The first file contains the household identifier (hhid), individual identifier (pidlink), relation to head (rwhead), age, highest school attended, and grade completed.
      If rwhead = 6 in the first data file, it means that parents and their child live in the same household, with the son as head of the household. In the second file, the head of the household or the head's spouse is asked about parents who are not living in the same household: their age, whether they are dead or alive, and the same variables as above. I want to use this information to construct father-son pairs so that I can study intergenerational mobility without the co-residence problem in the data. I want to end up with one file that has the same shape as the first file.

      Please suggest whether I need to merge or append the data, and which identifier I should use. The two data files are shown below, in that order.


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte(rwhead sex age mstatus religion highest_sch_att grade_comp currently_studying) str9(hhid pidlink) byte area
      1 1 33 2 2 2  3 6 "120100106" "001060001" 2
      2 3 30 2 2 2 96 6 "120100106" "001060002" 2
      1 1 40 2 2 2  1 6 "120100108" "001080001" 2
      2 3 38 2 2 2  1 6 "120100108" "001080002" 2
      6 3 60 5 2 1 96 6 "120100108" "001080011" 2
      1 1 37 2 2 2  7 6 "120100122" "001220001" 2
      2 3 28 2 2 2  3 6 "120100122" "001220002" 2
      1 1 42 2 2 2  2 6 "120100124" "001240001" 2
      2 3 40 2 2 2  3 6 "120100124" "001240002" 2
      1 1 45 2 2 6 96 6 "120100125" "001250001" 2
      2 3 36 2 2 1 96 6 "120100125" "001250002" 2
      1 1 37 2 2 2  4 6 "120100129" "001290001" 2
      2 3 34 2 2 1 96 6 "120100129" "001290002" 2
      1 1 56 2 3 1 96 6 "120100201" "002010001" 2
      2 3 46 2 3 1 96 6 "120100201" "002010002" 2
      3 3 26 2 3 1 96 6 "120100201" "002010003" 2
      5 1 37 2 3 3  2 6 "120100201" "002010004" 2
      6 3 48 5 3 1 96 6 "120100202" "002020003" 2
      1 1 70 2 2 2 96 6 "120100203" "002030001" 2
      2 3 50 2 2 1 96 6 "120100203" "002030002" 2
      1 1 30 2 2 2 96 6 "120100205" "002050001" 2
      2 3 27 2 2 2  2 6 "120100205" "002050002" 2
      1 3 98 5 2 1 96 6 "120100206" "002060001" 2
      1 1 42 2 3 1 96 6 "120100209" "002090001" 2
      6 3 53 5 3 1 96 6 "120100209" "002090005" 2
      1 1 60 2 2 2  1 6 "120100211" "002110001" 2
      2 3 52 2 2 2  2 6 "120100211" "002110002" 2
      1 1 43 2 3 2 96 6 "120100212" "002120001" 2
      2 3 28 2 3 2  4 6 "120100212" "002120002" 2
      1 1 35 2 3 1 96 6 "120100213" "002130001" 2
      2 3 32 2 3 1 96 6 "120100213" "002130002" 2
      1 1 29 2 2 2  1 6 "120100214" "002140001" 2
      1 1 43 2 2 2  5 6 "120100215" "002150001" 2
      2 3 32 2 2 2  2 6 "120100215" "002150002" 2
      7 3 86 4 2 1 96 6 "120100215" "002150004" 2
      1 1 71 2 2 2 96 6 "120100216" "002160001" 2
      1 1 40 2 2 1 96 6 "120100217" "002170001" 2
      2 3 30 2 2 1 96 6 "120100217" "002170002" 2
      1 1 30 2 2 1 96 6 "120100218" "002180001" 2
      8 1 27 2 2 1 96 6 "120100218" "002180002" 2
      8 3 28 2 2 1 96 6 "120100218" "002180003" 2
      6 1 65 5 2 1 96 6 "120100218" "002180004" 2
      1 3 60 5 2 1 96 6 "120100219" "002190001" 2
      1 1 33 2 3 6  7 6 "120100220" "002200001" 2
      1 1 50 2 2 1 96 6 "120100221" "002210001" 2
      2 3 44 2 2 1 96 6 "120100221" "002210002" 2
      1 1 32 2 3 1 96 6 "120100222" "002220001" 2
      2 3 29 2 3 1 96 6 "120100222" "002220002" 2
      6 3 58 5 3 1 96 6 "120100222" "002220003" 2
      1 1 40 2 2 1 96 6 "120100223" "002230001" 2
      2 3 50 2 2 1 96 6 "120100223" "002230002" 2
      1 1 40 2 2 2  7 6 "120100224" "002240001" 2
      2 3 36 2 2 1 96 6 "120100224" "002240002" 2
      1 1 28 2 2 2  4 6 "120100225" "002250001" 2
      2 3 27 2 2 2  1 6 "120100225" "002250002" 2
      6 3 48 5 2 1 96 6 "120100225" "002250007" 2
      1 1 50 2 2 1 96 6 "120100226" "002260001" 2
      2 3 40 2 2 1 96 6 "120100226" "002260002" 2
      1 1 65 2 2 1 96 6 "120100227" "002270001" 2
      2 3 50 2 2 2  3 6 "120100227" "002270002" 2
      1 1 64 2 3 1 96 6 "120100228" "002280001" 2
      2 3 62 2 3 1 96 6 "120100228" "002280002" 2
      1 1 26 2 3 2  2 6 "120100229" "002290001" 2
      6 1 70 2 3 1 96 6 "120100229" "002290005" 2
      6 3 60 2 3 1 96 6 "120100229" "002290006" 2
      1 1 62 2 2 1 96 6 "120100230" "002300001" 2
      2 3 58 2 2 1 96 6 "120100230" "002300002" 2
      1 1 45 2 1 2  7 6 "120200301" "003010001" 2
      2 3 35 2 1 2  3 6 "120200301" "003010002" 2
      1 1 26 2 1 2  3 6 "120200302" "003020001" 2
      2 3 26 2 1 2  4 6 "120200302" "003020002" 2
      1 3 52 5 1 2  4 6 "120200303" "003030001" 2
      3 1 26 1 1 9  7 6 "120200303" "003030002" 2
      1 1 56 2 1 9  1 6 "120200305" "003050001" 2
      2 3 42 2 1 2  7 6 "120200305" "003050002" 2
      1 3 85 5 1 1 96 6 "120200306" "003060001" 2
      3 3 56 4 1 2  1 6 "120200306" "003060002" 2
      2 3 30 2 1 3  7 6 "120200307" "003070002" 2
      1 1 36 2 1 2  7 6 "120200308" "003080001" 2
      2 3 31 2 1 2  7 6 "120200308" "003080002" 2
      1 1 43 2 1 6  7 6 "120200309" "003090001" 2
      2 3 48 2 1 5  7 6 "120200309" "003090002" 2
      1 1 57 2 1 2  4 6 "120200310" "003100001" 2
      2 3 48 2 1 2  3 6 "120200310" "003100002" 2
      2 3 27 2 1 2  7 6 "120200311" "003110002" 2
      1 1 26 2 1 5  7 6 "120200312" "003120001" 2
      2 3 29 2 1 3  1 6 "120200312" "003120002" 2
      1 1 35 2 1 2  5 6 "120200313" "003130001" 2
      2 3 26 2 1 3  7 6 "120200313" "003130002" 2
      1 1 48 2 1 2  3 6 "120200314" "003140001" 2
      2 3 47 2 1 2  7 6 "120200314" "003140002" 2
      1 3 43 5 1 2  7 6 "120200315" "003150001" 2
      1 3 65 5 1 2  1 6 "120200316" "003160001" 2
      1 3 39 5 1 3  1 6 "120200317" "003170001" 2
      1 1 36 2 1 2  7 6 "120200318" "003180001" 2
      2 3 40 2 1 6  7 6 "120200318" "003180002" 2
      1 1 41 2 1 6  7 6 "120200319" "003190001" 2
      2 3 38 2 1 4  7 6 "120200319" "003190002" 2
      1 1 62 2 1 2  7 6 "120200320" "003200001" 2
      2 3 57 2 1 3  7 6 "120200320" "003200002" 2
      end
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte(age grade_comp) str9(hhid pidlink) float(highest_sch_att rwhead)
      68  3 "120100106" "001060001" 10 54
      60  3 "120100108" "001080001"  2 54
      57 98 "120100122" "001220001"  2 54
      65  7 "120100122" "001220002"  2 54
      40 96 "120100124" "001240001"  1 54
      50 96 "120100124" "001240002"  1 54
      95 96 "120100125" "001250001"  1 54
      50 96 "120100125" "001250002"  1 54
      68 96 "120100129" "001290001"  1 54
      70 96 "120100201" "002010001"  1 54
      70 96 "120100201" "002010002"  1 54
      40 96 "120100203" "002030002"  1 54
      70  2 "120100204" "002040001"  2 54
      60  4 "120100204" "002040002"  2 54
      60 96 "120100205" "002050001"  1 54
      60 96 "120100209" "002090001"  1 54
      48 96 "120100210" "002100001"  1 54
      58 96 "120100210" "002100002"  1 54
      70 96 "120100211" "002110001"  1 54
      80  7 "120100211" "002110002"  2 54
      85 96 "120100212" "002120001"  1 54
      70 96 "120100212" "002120002"  1 54
      76 96 "120100214" "002140001"  1 54
      55  1 "120100214" "002140002"  2 54
      95 96 "120100215" "002150002"  1 54
      70 96 "120100217" "002170002"  1 54
      60 96 "120100220" "002200001"  1 54
      70 96 "120100222" "002220001"  1 54
      50 96 "120100222" "002220002"  1 54
      70 96 "120100223" "002230002"  1 54
      90 98 "120100224" "002240001" 10 54
      95 96 "120100224" "002240002"  1 54
      90 96 "120100225" "002250001"  1 54
      70 96 "120100225" "002250002"  1 54
      50 96 "120100226" "002260001"  1 54
      60 96 "120100227" "002270001"  1 54
      66 96 "120100228" "002280001"  1 54
      95 96 "120100228" "002280002"  1 54
      53 96 "120100229" "002290002"  1 54
      50 96 "120100230" "002300001"  1 54
      50 96 "120100230" "002300002"  1 54
      75 98 "120200301" "003010001" 10 54
      65  4 "120200301" "003010002" 10 54
      63 98 "120200303" "003030001" 98 54
      65  4 "120200304" "003040001"  2 54
      58  7 "120200304" "003040002"  2 54
      75  3 "120200305" "003050001"  2 54
      90 98 "120200305" "003050002" 10 54
      95 96 "120200306" "003060002"  1 54
      70  7 "120200307" "003070001"  2 54
      75 98 "120200307" "003070002" 98 54
      76  7 "120200308" "003080001"  2 54
      58 98 "120200308" "003080002" 98 54
      86  7 "120200309" "003090001" 10 54
      60 98 "120200309" "003090002" 98 54
      95 98 "120200310" "003100001" 10 54
      50 98 "120200311" "003110001" 98 54
      60  7 "120200311" "003110002"  5 54
      76 98 "120200312" "003120001"  2 54
      60 96 "120200312" "003120002"  1 54
      60  7 "120200313" "003130001"  2 54
      41 98 "120200313" "003130002"  2 54
      40 98 "120200314" "003140001" 98 54
      70 96 "120200314" "003140002"  1 54
      70  2 "120200315" "003150001" 10 54
      70 98 "120200317" "003170001"  2 54
      59  7 "120200318" "003180002"  2 54
      70  7 "120200319" "003190001" 10 54
      70 98 "120200319" "003190002" 98 54
      70 96 "120200320" "003200001"  1 54
      70  7 "120200320" "003200002"  2 54
      93  3 "120200321" "003210001"  2 54
      60  7 "120200322" "003220001" 10 54
      70 98 "120200322" "003220002" 10 54
      60 96 "120200323" "003230001"  1 54
      70 98 "120200324" "003240001"  2 54
      65  3 "120200326" "003260001" 10 54
      84 98 "120200326" "003260002" 98 54
      45  2 "120200327" "003270002"  2 54
      60  3 "120200328" "003280001" 10 54
      70 98 "120200328" "003280002" 98 54
      65 98 "120200329" "003290001" 98 54
      45 96 "120200330" "003300001"  1 54
      51  7 "120400401" "004010001"  5 54
      55 96 "120400402" "004020002"  1 54
      67  7 "120400403" "004030001"  3 54
      75 96 "120400403" "004030002"  1 54
      75  3 "120400404" "004040001"  2 54
      60  1 "120400404" "004040002"  2 54
      72 96 "120400405" "004050001"  1 54
      45  4 "120400405" "004050002"  2 54
      70 96 "120400406" "004060001"  1 54
      80  7 "120400406" "004060002"  2 54
      55 99 "120400411" "004110001"  5 54
      85 96 "120400411" "004110002"  1 54
      34 98 "120400412" "004120001"  2 54
      80  1 "120400413" "004130001"  2 54
      35 96 "120400413" "004130002"  1 54
      70  4 "120400414" "004140002"  2 54
      60 96 "120400415" "004150001"  1 54
      end



      • #4
        Thank you for posting the data. The situation is a bit clearer, but I still don't see enough information to propose a solution. There are several aspects of the data that puzzle me.

        1. The combination of hhid and pidlink does not uniquely identify individuals. For example, hhid 120100106 and pidlink 001060001 are used for two people, one of whom is 33 years old and the other 68, and they also have different values of highest schooling. This is not an isolated example: there are many like it. In fact, most of the hhid-pidlink combinations refer to more than one person.

        2. Even assuming there is some other variable in the data that actually enables you to pinpoint individuals, I don't see how you would know who is whose father or child, etc.
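
        To make point 1 concrete, here is a minimal sketch that stacks a few of the posted records from each file and reports the hhid-pidlink collisions. The variable source and the tempfile name roster are illustrative additions, not part of the survey data, and only a handful of the posted rows are reproduced.

```stata
* A few records copied from the first posted file (household roster).
clear
input str9(hhid pidlink) byte(age rwhead)
"120100106" "001060001" 33 1
"120100106" "001060002" 30 2
"120100108" "001080001" 40 1
end
gen byte source = 1          // illustrative marker: 1 = first file
tempfile roster
save `roster'

* A few records copied from the second posted file.
clear
input str9(hhid pidlink) byte(age rwhead)
"120100106" "001060001" 68 54
"120100108" "001080001" 60 54
end
gen byte source = 2          // illustrative marker: 2 = second file

* Stack the two files and look for repeated hhid-pidlink combinations.
append using `roster'
duplicates report hhid pidlink
duplicates tag hhid pidlink, generate(dup)
sort hhid pidlink source
list hhid pidlink source age rwhead if dup > 0, sepby(hhid pidlink)
```

        Run as a single do-file, this lists the same hhid-pidlink pair against clearly different people, which is why those two variables alone cannot serve as a person-level key across the files.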

        My sense at this point is that the Stata coding is not the obstacle here. I think you need to get a much clearer understanding of how the data in these files are organized, what the codings of the variables mean, and what variables can link people in one file to people in the other files in different ways. I think when you have all of that clear, figuring out how to code it will be straightforward. To work all of that out, though, you will need to consult with somebody who is familiar with this particular survey.
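
        Purely as an illustration of how short the code could become once the linkage is known: suppose (and this is an assumption to verify against the survey documentation, not something the posted data establish) that each record in the second file describes a non-resident parent of the person whose pidlink it carries. Under that assumption, the parent variables could be attached to the roster with a 1:1 merge. The par_ prefix and the tempfile name parents are hypothetical.

```stata
* Second file: ASSUMED to hold one non-resident parent per respondent,
* keyed by the respondent's own hhid and pidlink.
clear
input str9(hhid pidlink) byte(age grade_comp highest_sch_att)
"120100106" "001060001" 68 3 10
"120100108" "001080001" 60 3  2
end
* Prefix the parent's variables so they do not clash with the child's.
rename (age grade_comp highest_sch_att) ///
       (par_age par_grade_comp par_highest_sch_att)
isid hhid pidlink            // stops with an error if the key is not unique
tempfile parents
save `parents'

* First file: household roster, a few records from the posted example.
clear
input str9(hhid pidlink) byte(rwhead age highest_sch_att grade_comp)
"120100106" "001060001" 1 33 2  3
"120100106" "001060002" 2 30 2 96
"120100108" "001080001" 1 40 2  1
end
isid hhid pidlink

* Attach parent information where it exists; keep every roster record.
merge 1:1 hhid pidlink using `parents', keep(master match)
list, sepby(hhid)
```

        If the second file turns out to hold more than one parent per respondent, or if its pidlink refers to someone else, the isid check or the merge will fail, and a different key or an m:1 structure would be needed.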

