  • Running firm age

    Dear Statalisters,

    I have panel data (see below); gvkey is the firm id. I had a previous post on this topic: http://www.statalist.org/forums/foru...ulate-firm-age
    The difference this time is how the date is stored (as a long).

    bysort gvkey: egen firstdate=min(datadate)
    bysort gvkey: egen lastdate=max(datadate)
    gen running_age= (datadate- firstdate)/365.25

    datadate gvkey
    19731231 1000
    19741231 1000
    19751231 1000
    19761231 1000
    …… (omitted for brevity)
    20101231 293884
    20111231 293884
    20121231 293884
    20121231 294524
    20121231 296318



    sum running_age

        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
     running_age |      99425     418.868    320.4833   27.64682   1722.379

    My question is about running_age: why is it not in years?


    Best,
    Rochelle

  • #2
    Evidently your date variable is not a Stata daily date variable and so dividing by 365.25 will give absurd results. For example this calculation should on your logic return 1, or nearly so, but does not:

    Code:
    . di (20121231 - 20111231)/365.25
    27.378508
    Storage type is not the issue here; it is that your way of representing dates entails lots of jumps, e.g. from

    20120131

    to

    20120201

    which is a jump of 70, and (!!!) from

    20111231

    to

    20120101

    which is a jump of 8870, etc. For calculating ages from daily date differences, any variable that increased by 1 (and only 1) every day would suffice, but your variable fails.

    More simply put, there are far fewer than 10000 days in a year, yet your variable advances by 10000 from one year-end to the next; hence results will on average be too high by a factor of about 10000/365, or roughly 27. There is no simple fix, and the calculations need to be redone.
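    The jumps described above can be checked with a few display commands. This is only a small sketch: mdy() is used to build true Stata daily dates for comparison, and nothing here depends on the data in the original post.

```stata
* One calendar day apart, but the yyyymmdd integers differ by 70
display 20120201 - 20120131
* The same two days as Stata daily dates differ by exactly 1
display mdy(2, 1, 2012) - mdy(1, 31, 2012)

* Across a year boundary the integer difference is 8870
display 20120101 - 20111231
* As Stata daily dates the difference is again 1
display mdy(1, 1, 2012) - mdy(12, 31, 2011)

* One year-end to the next: 10000 "days", hence the inflation factor
display (20121231 - 20111231) / 365.25
```

    The last line reproduces the 27.378508 shown earlier in this reply.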

    Again, the fact that your date variable looks like a date variable to you is immaterial; Stata thinks of it only as a set of large integers.

    There are serious repercussions if you ever tsset or xtset using this variable as the time variable: many, if not all, of the results that depend on it will be worthless.

    You can map your variable as follows:

    Code:
     
    gen dailydate = daily(string(datadate, "%12.0f"), "YMD")
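    Putting the pieces together, one way to redo the age calculation might look like the sketch below. It uses the first four observations listed in the question as toy data; the variable name firstdaily and the %td display format are assumptions added for illustration, not part of the original code.

```stata
* Toy data: the first four observations from the question
clear
input long datadate long gvkey
19731231 1000
19741231 1000
19751231 1000
19761231 1000
end

* Convert the yyyymmdd integer to a Stata daily date (days since 01jan1960)
gen dailydate = daily(string(datadate, "%12.0f"), "YMD")
* Attach a daily display format so the values read as dates (assumed choice)
format dailydate %td

* First observed date per firm (firstdaily is a hypothetical name)
bysort gvkey (datadate): egen firstdaily = min(dailydate)

* Elapsed time since the firm's first observation, in years
gen running_age = (dailydate - firstdaily) / 365.25

list datadate dailydate running_age, noobs
```

    With true daily dates, consecutive year-ends are 365 or 366 days apart, so running_age comes out close to 0, 1, 2 and 3 for these four rows.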




    • #3
      Thank you, Nick, for enlightening me!

      One question, if I may:

      When we see a variable (such as datadate) that looks like a date but is not stored as a Stata date, how do we detect that it is not a date variable? Do we run some kind of test?



      • #4
        In help datetime, the section "Displaying SIFs in HRF" lists the formats that are used to display Stata Internal Format (SIF) dates in Human Readable Form (HRF).

        If a variable intended to be a date or time, that looks like a date or time when displayed, does not have one of those formats, then it is not a Stata Internal Format date or time. You can see the format in a number of ways, including, for your example, the command describe datadate.
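        A minimal sketch of that check follows. The one-row toy dataset and the test on the leading characters of the format are assumptions for illustration; the idea is simply that Stata date and time display formats begin with %t (or the older %d), whereas a plain long gets a numeric format by default.

```stata
* Toy data: a yyyymmdd integer stored as long
clear
input long datadate
20121231
end

* Show storage type and display format
describe datadate

* Capture the display format in a local macro and print it
local fmt : format datadate
display "`fmt'"

* Ignore a left-justification flag such as %-td before testing the prefix
local fmt2 = subinstr("`fmt'", "-", "", 1)
if inlist(substr("`fmt2'", 1, 2), "%t", "%d") {
    display "datadate carries a date/time display format"
}
else {
    display "datadate has no date/time display format"
}
```

        Note that the format is only a display setting, so this check shows how Stata has been told to display the variable, not whether the stored values are sensible dates.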

        In the Stata User's Guide, I found Chapter 24, "Working with Dates and Times", to be a valuable guide to understanding how Stata handles dates and times. I strongly recommend that anyone working with Stata dates and times read it. So does StataCorp: in a section of the Getting Started manual they say the following:
        Ideally, after reading this Getting Started manual, you should read the User’s Guide from cover to cover, but you probably want to become at least somewhat proficient in Stata right away. Here is a suggested reading list of sections from the User’s Guide and the reference manuals to help you on your way to becoming a Stata expert.
        Working with Dates and Times is included in that list.



        • #5
          William has answered the last question very well. With the exception of yearly dates, as they usually appear in data for anybody but historians and archaeologists, almost all date handling is tricky in some sense, and different programs often use different methods to handle dates. So the main answer has to be: read the documentation to find out what you should be doing. Otherwise, it should be clear when you think about it that large integers such as 20150331 can be treated as daily dates only for very limited kinds of problem.



          • #6
            Many thanks to William and Nick! I will read Chapter 24.

            Rochelle
