  • Creating a year variable

    Hi All

    I'm creating a set of 5 date variables to calculate the age of my study participants. The 5 variables correspond to the years 1982, 1989, 1999, 2009 and 2015. Day and month are missing, so these five variables will hold only the year.

    For the first one, in 1982:

    Code:
    gen year36 = 1982
    br ID year36
    gen date36=year(year36)
    
    format date36 %ty
    The variable date36 shows as 1965, so I've missed a step in between somewhere....

    Could someone help me?

    Thanks
    /Amal

  • #2

    year36 is fine as a variable holding years. The problem is that

    Code:
    year()
    is a function with the purpose of extracting a year from a daily date. See the fine help:

    year(e_d)
    Description: the numeric year corresponding to date e_d
    Domain e_d: %td dates 01jan0100 to 31dec9999 (integers -679,350 to 2,936,549)
    Range: integers 0100 to 9999 (but probably 1800 to 2100)

    1982 as a daily date was a day in 1965, which explains your result.


    Code:
    . di %td 1982
    05jun1965
    This isn't a missed step. It is an unneeded step: as far as we can tell, year36 is fine for your purposes.
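A minimal sketch of the distinction, assuming a fresh Stata session. It only uses the value and the birthday from this thread. The point is that year() reads its argument as a count of days since 01jan1960.

```stata
* year() expects a daily date, not a calendar year.
* Read as a day count, 1982 falls in mid-1965.
display %td 1982
display year(1982)

* Fed a genuine daily date, year() returns the calendar year.
display year(mdy(3, 15, 1946))
```

The first two lines show 05jun1965 and 1965, which is the value that appeared in date36. The last line shows 1946.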



    • #3
      Hi Nick

      Thanks for the quick reply.

      But don't I need to inform Stata that 1982 is a 'year' via:

      Code:
       
       gen date36=year(year36)
      The variable indicating birthday (which will be used with the above year36 to calculate age) is in date format:

      Code:
      gen day = 15
      gen month = 3
      gen year = 1946
      
      gen birthdate = mdy(month, day, year)
      format birthdate %d
      Thanks
      /Amal



      • #4
        No, no, no.

        year() does what it does and nothing else. It is not only not needed for what you want: it will produce nonsense if fed a literal year value.

        You can, if you like, give a variable containing years a yearly display format, but that is cosmetic: the stored values will not change.

        What you are doing with the birthdays looks fine to me. But strictly there is no such thing as "date format". There are daily date formats, monthly date formats, and several more.
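As a rough illustration of both points, the sketch below builds a one-row toy dataset. The names bdaily, bmonthly and byearly are made up for this example. It shows that a display format leaves the stored number alone, and that the same birthday has different numeric values in daily, monthly and yearly units.

```stata
clear
set obs 1

* A yearly display format does not change what is stored.
gen year36 = 1982
format year36 %ty
assert year36 == 1982

* One birthday, three units: days, months and years.
gen bdaily   = mdy(3, 15, 1946)
gen bmonthly = mofd(bdaily)
gen byearly  = yofd(bdaily)
format bdaily %td
format bmonthly %tm
list
```

Here mofd() and yofd() convert a daily date to a monthly date and to a year respectively, so each variable needs a format that matches its own unit.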



        • #5
          Okay - I tried the above (including calculating the age variable), but something is going wrong somewhere:

          Code:
          gen day = 15
          gen month = 3
          gen year = 1946
          br NSHD_ID day month year
          
          gen birthdate = mdy(month, day, year)
          format birthdate %d
          br NSHD_ID day month year birthdate
          
          gen year36 = 1982
          gen year43 = 1989
          gen year53 = 1999
          gen year63 = 2009
          gen year69 = 2015
          
          br ID year36 year43 year53 year63 year69
          
          gen age36 = year36-birthdate
          gen age43 = year43-birthdate
          gen age53 = year53-birthdate
          gen age63 = year63-birthdate
          gen age69 = year69-birthdate
          The age36 variable has the value 7022, which isn't right. Or do I need to follow up by formatting the age variable?

          /Amal



          • #6
            Formatting is not the issue here. The issue is that you are subtracting a daily date from a yearly date. They are in quite different units.
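The arithmetic behind 7022 can be reproduced in a one-row toy dataset. This is a sketch for illustration only. The variable wrong is a made-up name, and the year-minus-year line gives the age reached during 1982, not an exact age on a particular day.

```stata
clear
set obs 1
gen birthdate = mdy(3, 15, 1946)
format birthdate %td
gen year36 = 1982

* birthdate is stored as days since 01jan1960, so it is negative here.
display birthdate[1]

* Mixed units: 1982 (a year) minus -5040 (a day count) gives 7022.
gen wrong = year36 - birthdate
display wrong[1]

* Like with like: year minus year.
gen age36 = year36 - year(birthdate)
display age36[1]
```

The three displayed values are -5040, 7022 and 36.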

            I can't see that you need those year variables at all as they are just holding constants -- but in themselves they do no harm.

            The calculation seems circular. Someone was born in 1946, so was 36 in 1982, and so forth. So, what do you want to do?

            Note that something like

            Code:
            gen age = (mdy(12, 31, year) - birthdate) / 365.25
            gives an age on the stated date. 365.25 seems to be acceptable precision for e.g. epidemiology.
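Applied to the variables from post #5, this might look like the sketch below. It assumes that the year in the formula should be the survey year (year36 here) and not the birth-year variable called year in that post. The 31 December reference date is carried over from the formula above.

```stata
clear
set obs 1
gen birthdate = mdy(3, 15, 1946)
format birthdate %td
gen year36 = 1982

* Age in years on 31 December of the survey year.
gen age36 = (mdy(12, 31, year36) - birthdate) / 365.25
display %6.2f age36[1]

* Completed years, if a whole-number age is wanted.
gen age36_completed = floor(age36)
```

For a birthday of 15 March 1946 this gives about 36.80, or 36 completed years.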



            • #7
              Hi Nick

              Just following up from the above: I complicated things. My dataset is longitudinal in long format, and each individual has 5 rows of data. I just want to create a variable called year that takes the same 5 values (1982, 1989, 1999, 2009 and 2015) for every subject and is understood by Stata to be in calendar years. How would I do this?

              Code:
              id year
              1 1982
              1 1989
              1 1999
              1 2009
              1 2015
              2 1982
              2 1989
              2 1999
              2 2009
              2 2015
              3 1982
              3 1989
              3 1999
              3 2009
              3 2015
              Many Thanks
              /Amal
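One possible sketch, not taken from the thread. It assumes exactly five rows per id, already in wave order, and builds toy data for three hypothetical subjects. The variable wave is a made-up helper. In real data, an existing wave or visit variable would be a safer basis than row order.

```stata
* Toy data: 3 hypothetical subjects, 5 rows each.
clear
set obs 15
gen id = ceil(_n / 5)

* Row counter within subject (assumes rows are in wave order).
bysort id: gen wave = _n

* Pick the calendar year that matches each wave.
gen year = real(word("1982 1989 1999 2009 2015", wave))

* Optional and cosmetic: a plain numeric year is already usable.
format year %ty
list, sepby(id)
```

As noted earlier in the thread, the variable needs no conversion to be used as a year. An age could then be computed as year - year(birthdate) if a daily-date birthdate is present.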

