
  • Caution using datetime (string conversion)

    Today I found a pitfall. The function <datetime> works as documented but can be misused. I usually work with dates, so I am familiar with
    g mydate = date(strvar, "MDY") // returns number of days
    but today I was converting strings provided in an SPSS file and, for once, was interested in differences in minutes. A typical SPSS value is
    "06/26/2025 06:55:08 PM" so I wrote
    g t1 = datetime(Start,"MDYhms") // returns number of milliseconds
    format t1 %tc
    but found that most of the values of t1 were slightly out! After re-checking the documentation a couple of times, I realised that <datetime> returns values so large that they have to be stored as double, as shown in the documentation and in recent responses on this list.
    g double t1 = datetime(Start, "MDYhms") // type must be specified, and handles the AM/PM
    It seems to me this is an arithmetic overflow, or a type violation. Indeed, under <recast> there is a warning that the <force> option "makes recast unsafe", as a double forced into a float "would lead to a slight rounding of values". Generating the variable without specifying a type looks like a default <force>.
    You have been warned!
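
    For the minute differences mentioned above, here is a minimal sketch. It assumes a second string variable End in the same SPSS-style layout; End and the names t_start, t_end and diff_min are hypothetical, made up for illustration. The function is written as clock(), as corrected later in this thread.

    ```stata
    * Illustration only: End and all new variable names are hypothetical
    clear
    set obs 1
    gen Start = "06/26/2025 06:55:08 PM"
    gen End   = "06/26/2025 07:10:38 PM"

    * Store both clock values as double so no milliseconds are lost
    gen double t_start = clock(Start, "MDYhms")
    gen double t_end   = clock(End, "MDYhms")
    format t_start t_end %tc

    * Difference in minutes: 1 minute = 60,000 milliseconds
    gen double diff_min = (t_end - t_start) / 60000
    list t_start t_end diff_min
    ```

    The two made-up timestamps are 15 minutes and 30 seconds apart, so diff_min should come out as 15.5. If t_start and t_end were generated without double, each could be rounded by many seconds and the difference would be unreliable.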

  • #2
    I am not quite sure what Allan is driving at here. First off, there is to my knowledge no Stata function datetime(): Software Possibly Suspect Somehow, I Be Mentioning?

    Otherwise agreed, you need to generate double to get correct values for large datetimes. Using clock() is not enough; the target must have enough bits to store values correctly.

    The point is mentioned, by Stata's count, 12 times in help datetime, but even very experienced Stata users, myself included, sometimes miss what is in plain sight.

    Here is what I think Allan was intending (and apologies if I am missing something):

    Code:
    clear
    
    set obs 1 
     
    gen Start = "06/26/2025 06:55:08 PM" 
    gen t1 = clock(Start,"MDYhms") // milliseconds, but stored as float by default, so rounded
    gen double t2 = clock(Start,"MDYhms") // milliseconds, stored exactly as double
    format t* %tc
    
    list 
    
         +------------------------------------------------------------------+
         |                  Start                   t1                   t2 |
         |------------------------------------------------------------------|
      1. | 06/26/2025 06:55:08 PM   26jun2025 18:54:17   26jun2025 18:55:08 |
         +------------------------------------------------------------------+
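
    The size of the discrepancy in t1 follows from float precision. As a rough sketch of the arithmetic, intended to be run from a do-file, the lines below compare the millisecond count with its float-rounded version using the float() function. The scalar name ms_exact is hypothetical and chosen only for illustration.

    ```stata
    * Illustration only: ms_exact is a hypothetical scalar name
    scalar ms_exact = clock("06/26/2025 06:55:08 PM", "MDYhms")

    display %15.0f ms_exact          // millisecond count held in double precision
    display %15.0f float(ms_exact)   // same count rounded to float precision
    display %tc ms_exact
    display %tc float(ms_exact)

    * A float has a 24-bit significand, so between 2^40 and 2^41
    * adjacent floats are 2^17 milliseconds apart
    display 2^17 / 1000              // spacing in seconds
    ```

    The count for mid-2025 is roughly 2.07e12 milliseconds, which lies between 2^40 and 2^41. Floats in that range are 131.072 seconds apart, so a float clock value can be wrong by up to about 65.5 seconds, which is consistent with the 51-second gap between t1 and t2 in the listing.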



    • #3
      Yes, precision issues can really bite. As noted, times where you are interested in seconds are one major pitfall; in my experience, long numeric IDs are another one.
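
      The ID pitfall can be reproduced with a small sketch like the one below. The ID value and the variable names id_float, id_long and id_double are made up for illustration.

      ```stata
      * Illustration only: the ID value and variable names are hypothetical
      clear
      set obs 1

      gen id_float = 123456789          // default storage type is float
      gen long id_long = 123456789      // long holds this integer exactly
      gen double id_double = 123456789  // double holds integers exactly up to 2^53

      format id_* %12.0f
      list
      ```

      A float holds every integer exactly only up to 2^24 = 16,777,216, so the nine-digit ID is rounded in id_float (to 123456792) while id_long and id_double keep it intact. For identifiers that are never used in arithmetic, storing them as strings is another common choice.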



      • #4
        Nick Cox is right. I should have copied from the log file rather than relying on my faulty memory. The function <clock> returns a <datetime> value.

        gen t1 = clock(Start,"MDYhms") // milliseconds, but stored as float by default, so rounded
        gen double t2 = clock(Start,"MDYhms") // milliseconds, stored exactly as double
        format t* %tc
