Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Stata is changing the year from 2022 to 1963

    I'm working with a series of dates and times and my code is changing the year from 2022 to 1962.

    I have date and time as two separate variable:

    format closest_date %tdnn/DD/YY
    format closest_time %tcHH:MM:SS


    I'm using this code to merge date and time into one variable:

    gen double closest_dts = closest_date*24*60*60*1000 + closest_time
    format closest_dts %tcnn/DD/YY_HH:MM:SS

    gen double datetime=date*24*60*60*1000+time
    format datetime %tcDD/NN/CCYY_HH:MM:SS


    closest_date closest_time closest_dts
    1/04/2023 14:44:03 1/04/1963 14:44:03

    Is anyone able to advise where I'm going wrong with my code?

    Thanks,

    Katie







  • #2
    Stata's internal time formats are defined relative to January 1, 1960. Your code references "dates" but in actuality is converting "days", as an offset, into datetimes (i.e., computing the number of milliseconds that have passed). Please post a data example using -dataex- and the exact code you are using so we can get a better sense of how to help you.

    Comment


    • #3
      I cannot replicate your problem based on my understanding of what you show in #1. I do note that your -gen double datetime- command refers to variables date and time that do not appear to exist. But on the assumption that you meant closest_date and closest_time, respectively, everything runs properly:
      Code:
      . clear*
      
      . set obs 1
      Number of observations (_N) was 0, now 1.
      
      . gen closest_date = td(4jan2023)
      
      . format closest_date %tdnn/DD/YY
      
      .
      . gen closest_time = tc(1jan1960 14:44:03)
      
      . format closest_time %tcHH:MM:SS
      
      .
      . list, noobs clean abbrev(12)
      
          closest_date   closest_time  
               1/04/23       14:44:03  
      
      .
      . gen double closest_dts = closest_date*24*60*60*1000 + closest_time
      
      . format closest_dts %tcnn/DD/YY_HH:MM:SS
      
      .
      . gen double datetime=closest_date*24*60*60*1000+closest_time
      
      . format datetime %tcDD/NN/CCYY_HH:MM:SS
      
      .
      . list,noobs clean abbrev(12)
      
          closest_date   closest_time        closest_dts              datetime  
               1/04/23       14:44:03   1/04/23 14:44:03   04/01/2023 14:44:03
      I endorse Leonardo Guizzetti's advice to post back with a -dataex- example that demonstrates the problem you are having, along with the exact code you are using.

      Comment


      • #4
        He's a section of the data, the raw data is in excel and the date is formatted correctly in excel.

        Any advice greatl appreciated.




        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input int closest_date double closest_time
        21710      -1893420233000
        21922      -1893412787000
        21715      -1893381282000
        21705 -1893430122000.0002
        21705 -1893396238999.9998
        21717 -1893404991000.0002
        21717      -1.8934183e+12
        21710 -1893415076000.0002
        21701 -1893377421000.0002
        21736 -1893382617000.0002
        end
        format %tdnn/DD/YY closest_date
        format %tcHH:MM:SS closest_time


        Code:
        gen double closest_dts = closest_date*24*60*60*1000 + closest_time
         
        format closest_dts %tcnn-DD-YY_HH:MM:SS


        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input int closest_date double(closest_time closest_dts)
        21710      -1893420233000        -17676233000
        21922      -1893412787000           648013000
        21715      -1893381282000        -17205282000
        21705 -1893430122000.0002 -18118122000.000244
        21705 -1893396238999.9998 -18084238999.999756
        21717 -1893404991000.0002 -17056191000.000244
        21717      -1.8934183e+12        -1.70695e+10
        21710 -1893415076000.0002 -17671076000.000244
        21701 -1893377421000.0002 -18411021000.000244
        21736 -1893382617000.0002 -15392217000.000244
        end
        format %tdnn/DD/YY closest_date
        format %tcHH:MM:SS closest_time
        format %tcnn-DD-YY_HH:MM:SS closest_dts

        . li closest_date closest_time closest_dts in 1/10


        +----------------------------------------+
        | cl~_date close~me closest_dts |
        |----------------------------------------|
        1. | 6/10/19 09:56:07 6-10-59 09:56:07 |
        2. | 1/08/20 12:00:13 1-08-60 12:00:13 |
        3. | 6/15/19 20:45:18 6-15-59 20:45:18 |
        4. | 6/05/19 07:11:17 6-05-59 07:11:17 |
        5. | 6/05/19 16:36:01 6-05-59 16:36:01 |
        |----------------------------------------|
        6. | 6/17/19 14:10:08 6-17-59 14:10:08 |
        7. | 6/17/19 10:28:20 6-17-59 10:28:20 |
        8. | 6/10/19 11:22:03 6-10-59 11:22:03 |
        9. | 6/01/19 21:49:38 6-01-59 21:49:38 |
        10. | 7/06/19 20:23:02 7-06-59 20:23:02 |
        +----------------------------------------+

        Comment


        • #5
          OK, thanks for the -dataex-. The problem is that the variable closest_time is not what you think it is. Because it was formatted %tcHH:MM:SS, you, rather naturally, assumed that it contains the time as part of a Stata datetime variable with the date part set to zero (which corresponds, in Stata, to 1 Jan 1960).

          But that is not what it is. If you change the display format so that you can see the date part:
          Code:
          %tcHH:MM:SS closest_time
          
          .
          . gen double closest_dts = closest_date*24*60*60*1000 + closest_time
          
          .  
          . format closest_dts %tcnn-DD-YY_HH:MM:SS
          
          .
          . li closest_date closest_time closest_dts in 1/10
          
               +----------------------------------------+
               | close~te   close~me        closest_dts |
               |----------------------------------------|
            1. |  6/10/19   09:56:07   6-10-59 09:56:07 |
            2. |  1/08/20   12:00:13   1-08-60 12:00:13 |
            3. |  6/15/19   20:45:18   6-15-59 20:45:18 |
            4. |  6/05/19   07:11:17   6-05-59 07:11:17 |
            5. |  6/05/19   16:36:01   6-05-59 16:36:01 |
               |----------------------------------------|
            6. |  6/17/19   14:10:08   6-17-59 14:10:08 |
            7. |  6/17/19   10:28:20   6-17-59 10:28:20 |
            8. |  6/10/19   11:22:03   6-10-59 11:22:03 |
            9. |  6/01/19   21:49:38   6-01-59 21:49:38 |
           10. |  7/06/19   20:23:02   7-06-59 20:23:02 |
               +----------------------------------------+
          
          .
          end of do-file
          
          . format closest_time %tc
          
          . list closest_time
          
               +--------------------+
               |       closest_time |
               |--------------------|
            1. | 31dec1899 09:56:07 |
            2. | 31dec1899 12:00:13 |
            3. | 31dec1899 20:45:18 |
            4. | 31dec1899 07:11:17 |
            5. | 31dec1899 16:36:01 |
               |--------------------|
            6. | 31dec1899 14:10:08 |
            7. | 31dec1899 10:28:20 |
            8. | 31dec1899 11:22:03 |
            9. | 31dec1899 21:49:38 |
           10. | 31dec1899 20:23:02 |
               +--------------------+
          31 Dec 1899 is Excel's reference date, and Stata, during importation, did not seem to realize what was going on and failed to change that. Looking also just at the numerical values of closest_time in the -dataex- output you can also see that they are large negative numbers, not small positive numbers. You could resolve this by adding the difference between 1 jan 1960 and 31 dec 1899 (in milliseconds) to closest_time. But that will be a bizarre magic number, and the code will not be readily understandable to others, and probably not to you 6 months hence. So I would do it with transparent Stata datetime functions:
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input int closest_date double closest_time
          21710      -1893420233000
          21922      -1893412787000
          21715      -1893381282000
          21705 -1893430122000.0002
          21705 -1893396238999.9998
          21717 -1893404991000.0002
          21717      -1.8934183e+12
          21710 -1893415076000.0002
          21701 -1893377421000.0002
          21736 -1893382617000.0002
          end
          format %tdnn/DD/YY closest_date
          format %tcHH:MM:SS closest_time
          
          
          gen double closest_dts = cofC(Cdhms(closest_date, hh(closest_time), mm(closest_time), ///
              ss(closest_time)))
          format closest_dts %tcnn-DD-YY_HH:MM:SS
          
          list, noobs clean

          Comment


          • #6
            I've fixed the problem!

            Having said that the data was in the right format in excel, I hadn't realised that when I imported the date/time variables into Stata the format was changing. I only realised when I looked at the data in browse.

            To get round the problem I've mported the date/time variables as string into Stata and I've converted them into date/time double variables using the Stata.

            Thank you for the advice above.

            Many thanks,

            Katie

            Comment

            Working...
            X