  • Converting R date into STATA date

    Hi,

    I am trying to convert R dates into STATA dates but am unable to do so.

    For example, the R date in POSIXct format is 1231084800; this is the number of seconds since 1970-01-01 00:00. The actual date in MDY format is "Jan 4th, 2009".

    I followed the process described here - https://www.stata.com/manuals/ddatet...ersoftware.pdf

    generate double statatime = rtime - tC(01jan1970 00:00)
    format statatime %tC

    But STATA gives the answer as "14jan1950 05:58:04" which is wrong.

    How do I get the correct date, and how can I fetch the month, year, and day values from it?

    Thanks!


    Last edited by Prachi Singh; 22 Nov 2021, 00:20.

  • #2
    Hi,

    I have been able to partially solve my problem but am still facing issues.

    Below is the code I used to convert R dates into STATA dates and then extract the three components I need: day of the year, year, and hour.

    *-----------------------------------------------*
    input date
    1231084800
    1230768000
    1230840000
    end
    gen sdate1 = date*1000 + mdyhms(1,1,1970,0,0,0)
    format sdate %tC
    gen sdate2 = dofc(sdate1)
    format sdate2 %td
    gen hh = hh(sdate1)
    gen day = doy(sdate2)
    gen year = year(sdate2)
    *-----------------------------------------------*

    The three dates (1231084800, 1230768000, 1230840000) I entered are 4th Jan, 2009 16:00 GMT, 1st Jan 2009 00:00 GMT & 1st Jan 2009 20:00 GMT.

    But STATA gives incorrect answers, offsetting the dates by a few seconds. For example, STATA shows the first date as "04jan, 2009 15:59:30" and the second date as "31dec, 2008 23:59:30".

    As a result, when I extract the hour, I get 15 for the first date (it should be 16), with similar errors for the other dates.

    How do I address this problem?





    • #3
      Stata still defaults to storing numeric variables in float precision; you want double. Change

      Code:
      gen sdate1 = date*1000 + mdyhms(1,1,1970,0,0,0)
      to

      Code:
      gen double sdate1 = date*1000 + mdyhms(1,1,1970,0,0,0)
      Also, you want %tc format, not %tC format (i.e., not adjusting for leap seconds).
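
      To see both effects on the first value from #2, here is a minimal sketch (not taken from the posts above). It assumes only built-in functions; the scalar name ms is made up for illustration. float() rounds its argument to float precision, which mimics what a plain gen stores under the default type. Near 1.5e12 milliseconds, adjacent float values are 131,072 ms (about 131 seconds) apart, so whole seconds cannot all be represented.

      ```stata
      * Sketch only: "ms" is a hypothetical scalar name.
      * Scalars are held in double precision.
      scalar ms = 1231084800*1000 + mdyhms(1, 1, 1970, 0, 0, 0)

      * Full double precision, no leap-second adjustment
      display %tc ms

      * The same value after rounding to float precision
      display %tc float(ms)

      * The same double value displayed with leap-second adjustment
      display %tC ms
      ```

      Comparing the first display with the other two should show the two separate sources of the offset reported in #2.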



      • #4
        Originally posted by Prachi Singh
        How do I address this problem?
        .
        . version 17.0

        .
        . clear *

        .
        . input double date

                   date
          1. 1231084800
          2. 1230768000
          3. 1230840000
          4. end

        .
        . generate double sdate1 = date * 1000 + mdyhms(1, 1, 1970, 0, 0, 0)

        . format sdate %tcCCYY-NN-DD_HH:MM:SS

        . generate int sdate2 = dofc(sdate1)

        . format sdate2 %tdCCYY-NN-DD

        . generate double hh = hh(sdate1)

        . generate byte day = doy(sdate2)

        . generate int year = year(sdate2)

        .
        . list, noobs

          +----------------------------------------------------------------+
          |      date                sdate1       sdate2   hh   day   year |
          |----------------------------------------------------------------|
          | 1.231e+09   2009-01-04 16:00:00   2009-01-04   16     4   2009 |
          | 1.231e+09   2009-01-01 00:00:00   2009-01-01    0     1   2009 |
          | 1.231e+09   2009-01-01 20:00:00   2009-01-01   20     1   2009 |
          +----------------------------------------------------------------+

        .
        . exit

        end of do-file


        .
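
        The first post also asked about the month and the day of the month. A minimal sketch along the same lines might look like this; the variable names month, mday, and minute are chosen here for illustration and do not come from the thread. month() and day() take a %td date, while mm() takes a %tc date-time.

        ```stata
        * Sketch only: month, mday, and minute are illustrative names.
        clear
        input double date
        1231084800
        1230768000
        1230840000
        end

        * Seconds since 1970 -> milliseconds since 1960, held as double
        generate double sdate1 = date*1000 + mdyhms(1, 1, 1970, 0, 0, 0)
        format sdate1 %tc

        * Daily date derived from the date-time
        generate int sdate2 = dofc(sdate1)
        format sdate2 %td

        * Calendar components
        generate byte month  = month(sdate2)
        generate byte mday   = day(sdate2)
        generate byte minute = mm(sdate1)

        list, noobs
        ```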


        There's a StataCorp FAQ that doesn't exist, but ought to. It reads something to the effect of, We've discovered an oversight in our installation instructions that we rectify here in this erratum, to wit, the last step that every customer needs to perform in order to complete the installation of Stata properly is to execute the following at the command line:
        Code:
        set type double, permanently
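
        As a rough sketch of what the setting changes (the variable name x is made up for illustration): c(type) reports the current default storage type, and the classic comparison with 0.1 holds only when x is stored as a double. Without the permanently option, the setting lasts only until the session ends.

        ```stata
        * Sketch only: "x" is an illustrative variable name.
        clear
        set obs 1

        * Show the default storage type currently in effect
        display c(type)

        * Switch the default for this session only
        set type double
        display c(type)

        * x is now created as a double, so it equals the literal 0.1 exactly
        generate x = 0.1
        assert x == 0.1
        ```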



        • #5
          I don't agree with Joseph Coveney. float works fine as a default type for me (and for many users). But it is documented again and again that date-times should be stored as doubles, notably within

          Code:
          help datetime
          It is really hard for Stata to know if users are ignoring this from the outset, as in #2, where the implied instruction is to input as float.
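
          For users who keep float as the default, one possible safeguard is to check the storage type explicitly before working with a date-time. This is only a sketch; sdate1 stands in for any %tc variable. confirm exits with an error if the variable is not of the stated type.

          ```stata
          * Sketch only: sdate1 stands in for any date-time variable.
          clear
          set obs 1
          generate double sdate1 = mdyhms(1, 4, 2009, 16, 0, 0)
          format sdate1 %tc

          * Stops with an error unless sdate1 is stored as double
          confirm double variable sdate1
          ```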



          • #6
            Although beyond the original question, I would like to add that I lean towards making double the default storage type. Despite excellent documentation, I have repeatedly seen questions from users that arise because of the default storage type being float rather than double; I cannot, however, recall even a single question or problem that relates to wasted memory when storing variables, which is arguably the only downside to double precision.



            • #7
              It's also about the posts that would arise if the default were double and people were experiencing problems because they didn't know about it and were bitten thereby.

              This one has been debated back and forth among users for years, nay decades, and I doubt there will be movement. The bottom line is that if you want a default of double you can have it, but StataCorp isn't forcing it upon you.

              I don't know which is increasing faster, memory or typical dataset size, but we have a mix of cultures and set-ups here. I've seen people describe datasets with thousands of observations as very big, which must cause wry if not slightly wicked amusement among people with millions and millions. Fortunately, people usually keep that to themselves.



              • #8
                Thank you Joseph for helping out, the solution worked.
                Thanks Nick, I will keep in mind to store datetime variables in double format.
