  • Calculate firm age

    Dear Statalisters,

    I have panel data (see below); gvkey is the firm id.

    gvkey datadate
    1001 31dec1978
    1001 31dec1979
    1001 31dec1980
    1001 31dec1981
    1001 31dec1982
    1001 31dec1983
    1001 31dec1984
    1001 31dec1985
    1001 31may1986


    I want to compute firm age based on datadate, as the number of years since the firm's earliest datadate. Expected outcome:


    gvkey datadate age
    1001 31dec1978 0
    1001 31dec1979 1

    ... etc.

    I thought of using
    bysort gvkey: egen firstdate=min(datadate)
    bysort gvkey: egen lastdate=max(datadate)



    not sure if I should just do
    gen age=lastdate-datadate

    Thank you!
    Rochelle


  • #2
    Hello,

    You simply have to sort your data correctly, and then use the system variable _n combined with the by prefix.
    Code:
    sort gvkey datadate
    by gvkey: gen age = _n
    This creates a variable starting at 1 for the earliest date for each firm.
    If you want your age variable to start at zero, simply subtract one from the resulting variable.

    Hope this helps.
    Charlie


    • #3
      Charlie's solution will only work correctly if there are no gaps in the datadate variable. If some years are skipped, then the results will be wrong.

      Rochelle's proposed solution will work, with the understanding that the variable age she calculates gives the age in days. If age in years is desired, just divide by 365.25.

      Also, Rochelle's solution will calculate the age (in days) as of the last observation and place that in every observation for that firm. If she wants a running age that starts at zero and increases over time, then, she can dispense with the lastdate variable and just calculate -gen age = datadate-firstdate- (perhaps divided by 365.25 if she wants age in years.)
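
      A minimal sketch of the running-age calculation described above. It assumes datadate is already a numeric Stata daily date rather than a string; age_days and age_years are illustrative names, not variables from the thread.

      ```stata
      * earliest date per firm, repeated on every observation of that firm
      bysort gvkey: egen firstdate = min(datadate)
      format %td firstdate

      * running age: 0 at the first observation, increasing afterwards
      gen age_days  = datadate - firstdate
      gen age_years = age_days / 365.25   // approximate years
      ```

      Because the age is computed from the dates themselves rather than from the observation count, gaps in the panel do not distort it.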
      Last edited by Clyde Schechter; 05 Mar 2015, 09:09.


      • #4
        Clyde's right, my solution only fits to balanced panels.


        • #5
          personage (to be parsed as "person age") from SSC is pertinent once (as in the very first post) you have variables for first and last dates in the same observation.


          • #6
            Thank you Charlie and Clyde.

            @Clyde, You are right again. I also need running age.

            I did

            gen age = (datadate-firstdate)/365.25

            as an example
            gvkey datadate fyear age
            1001 31dec1978 1978 0
            1001 31dec1979 1979 .9993156

            For the second row I expect age=1, but I can't use the ceil function, because in the case of

            age=2.01, using ceil would give age=3.

            Is there another function?

            Best,
            Rochelle


            • #7
              Well, clearly at best 365.25 is an approximation for the average number of days in a year that will be wrong in every individual year. If you want exact calculations, do not use it. As I said, personage offers exact calculations.
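
              The arithmetic behind the .9993156 in #6 can be checked directly. This is only an illustrative sketch using the dates from the example; the round() line is one possible answer to the question in #6, under the assumption that rounding to the nearest whole year is acceptable.

              ```stata
              * 31dec1978 to 31dec1979 spans 365 days (1979 is not a leap year)
              display mdy(12,31,1979) - mdy(12,31,1978)

              * so dividing by 365.25 lands just under 1
              display 365/365.25

              * 31dec1979 to 31dec1980 spans 366 days (1980 is a leap year)
              display mdy(12,31,1980) - mdy(12,31,1979)

              * round() returns the nearest integer: 1 here, and 2 for 2.01
              display round(365/365.25)
              ```

              Note that round() also rounds up any fraction of .5 or more (for example 7.6 becomes 8), so it is not a substitute for an exact count of completed years.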


              • #8

                I think Nick's comment #7 applies to legal years. In biologic matters there is no better solution than dividing by 365.25.


                • #9
                  I'd agree with Svend to the extent that in biostatistics there seems a firmly established convention that dividing #days by 365.25 to get #years is

                  1. More than adequate precision for most scientific and statistical questions (it always being understood that in practice, as with ages of new-born babies, that researchers will work with different units when they are needed)

                  2. A simple habit that if followed uniformly will mean that researchers get the same results easily and without intricate coding.

                  My concern was that Rochelle was saying "That's not exactly right", which is true, and for which there is a remedy.


                  • #10
                    Thank you Svend and Nick for your responses !

                    I accept using
                    gen age = (datadate-firstdate)/365.25


                    Best.
                    Rochelle


                    • #11
                      While the name of Nick's personage seems to suggest that it is only relevant when calculating a person's age, it in fact calculates an exact age when the age is thought of as an integer. Note the difference when applied to Rochelle's data:

                      Code:
                      clear
                      input gvkey str20 datadate
                      1001 31dec1978
                      1001 31dec1979
                      1001 31dec1980
                      1001 31dec1981
                      1001 31dec1982
                      1001 31dec1983
                      1001 31dec1984
                      1001 31dec1985
                      1001 31may1986
                      end
                      gen ddate = date(datadate,"DMY")
                      format %td ddate
                      
                      sort gvkey ddate
                      
                      * Because of leap years, this is not quite right
                      by gvkey: gen age = int((ddate-ddate[1])/365.25)
                      
                      * This is better if age in years should be an integer
                      * (by gvkey: so that ddate[1] is each firm's own first date)
                      by gvkey: gen bday = mdy(month(ddate[1]),day(ddate[1]),year(ddate))
                      format %td bday
                      by gvkey: gen age2 = cond(bday <= ddate, ///
                          year(ddate) - year(ddate[1]), ///
                          year(ddate) - year(ddate[1]) - 1)
                      
                      * Using -personage- from SSC
                      by gvkey: gen bday1 = ddate[1]
                      personage bday1 ddate, gen(agey)
                      
                      list, noobs
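
                      To see where the two approaches differ, one could list only the rows on which they disagree. This is a small sketch that assumes the variables age and age2 created by the code above. With the sample data, the disagreement should be limited to 31dec1979 and 31dec1983, which fall 365 and 1826 days after the first date, just short of 1 and 5 times 365.25.

                      ```stata
                      * rows where the 365.25 approximation and the anniversary rule disagree
                      list gvkey ddate age age2 if age != age2, noobs
                      ```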
