  • Linking person-year records across different people (using NLSY97)

    Aloha,
    I am using the NLSY97 for a project in which I need to link siblings within the data. My observations are currently structured as follows:

    ID age sibid year older hhid
    1 x_1 2 1997 0 1
    1 x_2 2 1998 0 1
    2 y_1 1 1997 1 1
    2 y_2 1 1998 1 1
    etc.

    The "older" variable simply indicates if that sibling is the oldest within the household.

    I would like to add another variable "age_older" so that I have the age of the older sibling in the same person-year observation as the younger sibling. I imagine the data to look as follows:

    ID age sibid age_older year older hhid
    1 x_1 2 y_1 1997 0 1
    1 x_2 2 y_2 1998 0 1
    2 y_1 1 0__ 1997 1 1
    2 y_2 1 0__ 1998 1 1
    etc.

    Two questions. First, how do I write the commands to do this? Second, should "age_older" for the older sibling's own observations (the last two person-year observations above) be 0 or "." for missing? It really is an "N/A".

    I will be using this in a regression discontinuity design, with the cutoff based on the age of the older sibling.

    Thank you!

  • #2
    Code:
    clear
    input byte id str3 age byte sibid int year byte(older hhid)
    1 "x_1" 2 1997 0 1
    1 "x_2" 2 1998 0 1
    2 "y_1" 1 1997 1 1
    2 "y_2" 1 1998 1 1
    end
    
    frame put id hhid year age if older, into(older_sibs)
    frlink m:1 hhid sibid year, frame(older_sibs hhid id year)
    frget age_older = age, from(older_sibs)
    Note: This requires Stata 16 or later, because it uses frames.

    It does not make sense to code a not-applicable result as 0 or any other number. This is precisely what missing values are for. The above code does that automatically.
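
    As a minimal sketch of what that looks like in practice, the example below repeats the frames code with made-up numeric ages (12, 13, 15 and 16 are hypothetical stand-ins for x_1, x_2, y_1 and y_2, not real NLSY97 values). The older sibling's rows find no match in older_sibs, so -frget- leaves age_older missing for them, and estimation commands will then exclude those rows automatically.

    ```stata
    * Hypothetical numeric ages, for illustration only (requires Stata 16 or later)
    clear
    capture frame drop older_sibs   // allow re-running in the same session
    input byte id byte age byte sibid int year byte(older hhid)
    1 12 2 1997 0 1
    1 13 2 1998 0 1
    2 15 1 1997 1 1
    2 16 1 1998 1 1
    end

    * Copy the older siblings' person-years into their own frame
    frame put id hhid year age if older, into(older_sibs)

    * Link each observation to its sibling's row in the same household and year
    frlink m:1 hhid sibid year, frame(older_sibs hhid id year)

    * Pull the older sibling's age across the link
    frget age_older = age, from(older_sibs)

    * The older sibling's own rows are unmatched, so age_older is missing there
    list id year age age_older older, sepby(id)
    count if missing(age_older) & older
    ```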

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
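
    A minimal sketch of a typical -dataex- call, using the auto dataset that ships with Stata rather than the NLSY97 (the choice of variables and the range 1/5 are only for illustration):

    ```stata
    * Load a dataset that ships with Stata
    sysuse auto, clear

    * Print a copy-pasteable example of three variables for the first five observations
    dataex make price mpg in 1/5
    ```

    The output can be pasted directly into a post, and anyone can run it to recreate the example data.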


    • #3
      Clyde Schechter: Thank you for your response! I will use -dataex- in the future; this was my first post. I am currently using Stata 15. Would you suggest another method, or should I just bite the bullet and upgrade?

      • #4
        Code:
        preserve
        keep if older
        keep id hhid year age
        rename  id sibid
        rename age age_older
        tempfile older_sibs
        save `older_sibs'
        
        restore
        merge m:1 hhid sibid year using `older_sibs'
        This code will work in version 15. Remember that tempfiles and local macros go out of existence after execution of a do-file or program, or a block of lines from a do-file. So resist the temptation to "see what the code is doing" by running it line by line: done that way, the local macro `older_sibs' will be empty by the time -save- and -merge- run, and the code will not work. You must run this code in one fell swoop from beginning to end.

        The logic of this code is the same as the original; it just uses a -tempfile- instead of a frame. That means we have to do some dancing with the variable names that can be circumvented with frames. And if you have a gargantuan dataset, the performance will be slower--but I doubt with the NLSY it will be noticeable. (For those unfamiliar with NLSY it's the National Longitudinal Survey of Youth, carried out periodically in the United States.)
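
        One way to check the result of the -merge- is sketched below, again with made-up numeric ages (hypothetical values, not real NLSY97 data). The keep(master match) option is an addition not in the code above: it is assumed here that you want to keep every original observation and discard any older-sibling rows that match no younger sibling. Run it as a single block.

        ```stata
        * Hypothetical numeric ages, for illustration only
        clear
        input byte id byte age byte sibid int year byte(older hhid)
        1 12 2 1997 0 1
        1 13 2 1998 0 1
        2 15 1 1997 1 1
        2 16 1 1998 1 1
        end

        * Build a temporary file holding only the older siblings' ages
        preserve
        keep if older
        keep id hhid year age
        rename id sibid          // so the key matches the younger sibling's sibid
        rename age age_older
        tempfile older_sibs
        save `older_sibs'
        restore

        * Attach the older sibling's age; unmatched master rows stay in the data
        merge m:1 hhid sibid year using `older_sibs', keep(master match)

        * Younger siblings should be matched; older siblings are master only
        tabulate _merge older
        drop _merge
        ```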

        In the future, if you are not using the current version of Stata, it is best to state which version you are using so you get usable code the first time. That said, I think there have been a lot of really good improvements to Stata since version 15, so I would encourage you to upgrade.
