
  • How do we tell if a date variable is stored as a Stata date?

    Dear statalisters,

    I have the following variable in my data:

    startdate: type : long. Format %12.0g

    when i open the data, the content appears as 03oct2001.

    gen startyear = year(startdate) gives me the correct year.

    My question is: how do we know whether we can apply date functions such as year(), quarter(), etc. to a variable? I read the manual (http://www.stata.com/manuals13/ddatetime.pdf), but I do not completely understand it.

    thanks,
    Rochelle

  • #2
    startdate: type : long. Format %12.0g

    when i open the data, the content appears as 03oct2001
    That's hard to believe.

    Show us the output (pasted directly from the Results window--do not retype it) from
    Code:
    des startdate
    list startdate in 1/5
    It is not credible that a variable formatted %12.0g will display as 03oct2001:
    Code:
    . display %td 15251
    03oct2001
    
    . display %12.0g 15251
           15251
    In terms of the generic question, a Stata internal date will a) not be a string, b) will display looking like a date if you apply a %td format to it, and c) will display as a seemingly meaningless number (somewhere around 20,000 if it is a more or less current date).
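
    The three checks above can be tried on a toy variable. This is only a minimal sketch: the one-observation dataset is made up, and the value 15251 is borrowed from the display example above.

    ```stata
    clear
    set obs 1
    * hypothetical toy variable holding the number from the example above
    gen long startdate = 15251
    * (a) errors out if startdate is a string variable
    confirm numeric variable startdate
    * (c) without a date format, only the bare number is shown
    list startdate
    * (b) with a %td format, the same number is shown as a daily date
    format startdate %td
    list startdate
    * storage type and display format at a glance
    describe startdate
    ```

    Note that applying %td only changes how the number is displayed; the stored value stays the same.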



    • #3
      Clyde's last paragraph applies strictly to daily dates only. The condition b) would need modification for other kinds of dates. This should seem obvious, but even the Stata documentation sometimes carries the implication that dates means daily dates necessarily.
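
      As a rough illustration of the point in #3, the day 03oct2001 (15251 as a daily date, taken from #2) can be converted to other kinds of dates, each of which needs its own display format. This is a sketch using the conversion functions mofd(), qofd() and yofd():

      ```stata
      * 15251 is the daily date 03oct2001 from post #2
      display %td 15251
      * the same day as a monthly date: 2001m10
      display %tm mofd(15251)
      * the same day as a quarterly date: 2001q4
      display %tq qofd(15251)
      * the same day as a year: 2001
      display yofd(15251)
      * a monthly date shown with a daily format is not the intended date
      display %td mofd(15251)
      ```

      So "looks like a date under %td" is a test for daily dates only; a monthly date passes the analogous test under %tm, and so on.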



      • #4
        Hi Rochelle,

        I suppose the contradiction between the format %12.0g and the display "03oct2001" can be explained by "03oct2001" being a value label.

        As you know, Rochelle, Stata stores dates internally as integers. If you tell Stata that the integers are dates by applying format %..., then it is assumed that the integers mean the number of time units elapsed since 1 January 1960. This is the Stata Internal Form (SIF).

        If the integers in the data at hand have another meaning (as in the "artkalen" file of the German Socio-Economic Panel data, for instance), then this meaning has to be communicated by value labels.

        You can check whether this is the case with your data:
        Code:
         browse startdate, nolabel
        or, for the adherents of abbreviated code:
        Code:
         br startdate, nol
        03oct2001 should then display as 15251, as in Clyde Schechter's post, if it is in Stata Internal Form.

        To your original question: if your startdate is not SIF, you can generate a second startdate variable that complies with SIF (by adding or subtracting the offset from 01jan1960 in time units). If your calculation is correct, the second startdate variable, when formatted as a date, shows the same dates as the first one.
        You can then apply the Stata date functions to the second startdate variable.

        It may be the case, though, that the original startdate complies with SIF and has labels nevertheless. Then you can just apply format %td and drop the labels.

        Greetings, Klaudia
        Last edited by Klaudia Erhardt; 05 Jun 2015, 02:33.
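
        The situation described in #4 can be reproduced on a made-up one-observation dataset. This is only a sketch: the label name startlbl is hypothetical, and the value 15251 is taken from #2. It shows a %12.0g variable displaying as 03oct2001 because of a value label, and how nolabel reveals the stored number.

        ```stata
        clear
        set obs 1
        gen long startdate = 15251
        format startdate %12.0g
        * hypothetical value label that makes the number look like a date
        label define startlbl 15251 "03oct2001"
        label values startdate startlbl
        * shows the label text 03oct2001
        list startdate
        * shows the stored number 15251
        list startdate, nolabel
        * the stored number is already a SIF daily date here,
        * so apply a date format and detach the value label
        format startdate %td
        label values startdate .
        list startdate
        ```

        After the last two commands the variable shows 03oct2001 because of its %td format, not because of a label, and date functions such as year() apply to it directly.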



        • #5
          Dear all,

          I apologize for the mistake in my post #1. Clyde was correct: my startdate was not of type long.

                        storage   display    value
          variable name   type    format     label      variable label
          ----------------------------------------------------------------------------------------------------------------------
          startdate       int     %d


          list startdate in 1/5

               +-----------+
               | startdate |
               |-----------|
            1. | 03oct2001 |
            2. | 10feb1998 |
            3. | 11apr2001 |
            4. | 20aug2002 |
            5. | 27may1998 |
               +-----------+




          I have another variable
                        storage   display    value
          variable name   type    format     label      variable label
          ----------------------------------------------------------------------------------------------------------------------
          datadate        long    %12.0g


               +----------+
               | datadate |
               |----------|
            1. | 19910531 |
            2. | 19920531 |
               +----------+


          Is this variable in SIF?

          If so, what should I do to get the year and month?


          Thanks again,
          Rochelle



          • #6
            No; your variable datadate doesn't have a date format attached. You have it the wrong way round: people can see it's a date, but to Stata it's just a variable with large integer values about 20 million.

            To understand dates and times, there is one and only one safe road: to read

            help dates and times

            until you understand it.

            From experiment you can see that you can convert such a date to a string and then to a daily date:


            Code:
            . di %td  daily(string(19910531, "%8.0f"), "YMD")
            31may1991
            So for you

            Code:
            gen dailydate = daily(string(datadate, "%8.0f"),  "YMD")
            format dailydate %td
            should suffice.
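
            For completeness, here is a small self-contained sketch of the same conversion on the two values listed in #5, followed by a check that nothing failed to convert and by the year and month asked for there. The variable names yr, mon and mdate are just illustrative choices.

            ```stata
            clear
            input long datadate
            19910531
            19920531
            end
            * convert the yyyymmdd number to a string, then to a daily date
            gen dailydate = daily(string(datadate, "%8.0f"), "YMD")
            format dailydate %td
            * daily() returns missing for anything it cannot parse
            count if missing(dailydate) & !missing(datadate)
            * year and month of year
            gen yr = year(dailydate)
            gen mon = month(dailydate)
            * a monthly date (year and month together), if that is wanted
            gen mdate = mofd(dailydate)
            format mdate %tm
            list
            ```

            The count should be 0 here; a nonzero count would point to values of datadate that are not valid yyyymmdd dates.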



            • #7
              Thank you Nick !

              I did search the help on dates and times. If I may, one question about string():

              Is it always the case that, when a variable appears to be a date but the storage type is not date (e.g. my datadate),

              the first step is to use the string() function as you did in post #6, and then the daily() function?

              I did as you suggested,



              . gen dailydate = daily(string(datadate, "%8.0f"), "YMD")

              .
              . format dailydate %td

              . gen mon=month(dailydate)



              and got the month.
              Last edited by Rochelle Zhang; 05 Jun 2015, 09:16.



              • #8
                Date is not a storage type in Stata.

                To be useful, dates are held as numeric variables, and you can use any numeric storage type that works; but in practice dates need date formats to be intelligible.

                daily() expects a string, so that is one way to go from where you started. Your question appears to be: Do you always have to do this? The answer is No. There are other ways.

                For example, you could process dates such as yours by purely numeric operations,

                Code:
                 
                . local date = 19910531
                
                . local year = floor(`date'/10000)
                
                . local day = mod(`date',100)
                
                . local month = floor(`date'/100) - (`year' * 100)
                
                . di %td mdy(`month', `day', `year')
                31may1991
                or the equivalent using generate. But that's messy, here and in general.

                Note that month() delivers month of year, if that's what you want.
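
                The "equivalent using generate" mentioned above might look roughly like this. It is a sketch on the two values from #5; the variable names yr, mo and dy are illustrative, and the final assert simply cross-checks against the string-based route from #6.

                ```stata
                clear
                input long datadate
                19910531
                19920531
                end
                * peel year, month and day off the yyyymmdd number
                gen yr = floor(datadate/10000)
                gen mo = floor(datadate/100) - yr*100
                gen dy = mod(datadate, 100)
                * build the daily date from its components
                gen dailydate = mdy(mo, dy, yr)
                format dailydate %td
                * both routes should agree on every observation
                assert dailydate == daily(string(datadate, "%8.0f"), "YMD")
                list
                ```

                mdy() returns missing for impossible dates, so invalid values of datadate would show up as missing dailydate here as well.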

