  • Can't get consistent behavior from %td and %tc dates

    Hello,

    I have a date variable that includes hours, minutes, and seconds (hms), and I am trying to use it to identify the season in which each date falls. Here are some sample data.
    Code:
    clear
    input str18 startdate
    "27jun2017 17:41:17"
    "14jun2017 17:31:28"
    "11mar2019 20:13:10"
    "06sep2017 14:26:16"
    "16jun2017 16:00:39"
    "03may2019 10:00:07"
    "20jun2017 11:01:14"
    "03may2018 09:50:13"
    "14nov2018 11:46:36"
    "30may2019 11:34:01"
    "16jun2017 09:57:07"
    "23sep2017 17:49:21"
    "07sep2017 19:10:58"
    "25jan2018 09:29:11"
    "27jun2017 20:33:07"
    end
    If I split the date from the time, I am able to identify the seasons with the following code.

    Code:
    split startdate
    gen date1 = date(startdate1,"DMY")
    format date1 %td
    
    gen season1 = .
    replace season1 = 1 if inrange(date1,mdy(3,21,year(date1)),mdy(6,20,year(date1))) 
    replace season1 = 2 if inrange(date1,mdy(6,21,year(date1)),mdy(9,21,year(date1)))
    replace season1 = 3 if inrange(date1,mdy(9,22,year(date1)),mdy(12,20,year(date1)))
    replace season1 = 4 if date1 >= mdy(12,21,year(date1)) | date1 <= mdy(3,20,year(date1))
    That gives me what I want but I can't figure out why the following doesn't produce the same result.

    Code:
    gen date2 = clock(startdate1,"DMY hms")
    format date2 %tc
    
    gen season2 = .
    replace season2 = 1 if inrange(date2,mdy(3,21,year(dofc(date2))),mdy(6,20,year(dofc(date2)))) 
    replace season2 = 2 if inrange(date2,mdy(6,21,year(dofc(date2))),mdy(9,21,year(dofc(date2))))
    replace season2 = 3 if inrange(date2,mdy(9,22,year(dofc(date2))),mdy(12,20,year(dofc(date2))))
    replace season2 = 4 if date2 >= mdy(12,21,year(dofc(date2))) | date2 <= mdy(3,20,year(dofc(date2)))
    From what I understand, the dofc function should convert date2 from the date-clock type into the date type, and then the year function should extract the year just like it does in the first set of code. I haven't been able to figure out what I'm missing from the resources I've been able to get my hands on. Since I can get what I need from the first set of code, I'm mainly hoping to better understand the date-time functions, which I've always struggled with.

    Thanks,
    Lance

  • #2
    Your command -gen date2 = clock(startdate1,"DMY hms")- is wrong in two ways.

    First, the variable startdate1, which arose from the -split- command, does not in fact contain any information about hours, minutes, or seconds. Consequently, the value of date2 is missing in every observation. You probably meant to write -gen date2 = clock(startdate,"DMY hms")-. That's still wrong because the default storage type with -gen- is float, and a float does not have enough bits to hold all the information in a clock variable. So it needs to be -gen double date2 = clock(startdate,"DMY hms")-. (I suspect this float vs. double issue doesn't actually affect the problem you're having, but it would probably give you screwy results later on if you tried to actually use the hours, minutes, and seconds for something.)
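
    To see the float vs. double issue in isolation, here is a minimal sketch that only uses -display-, so it does not touch the data in memory. The timestamp is the first one from the sample data; float() rounds a number to float precision, which is what storing the result without "double" would do. At this magnitude adjacent floats are about two minutes apart, so the rounded value can be off by a minute or so.

    ```stata
    * The clock value at full (double) precision
    display %tc clock("27jun2017 17:41:17", "DMY hms")

    * The same value rounded to float precision, as -gen- without "double" would store it
    display %tc float(clock("27jun2017 17:41:17", "DMY hms"))
    ```

    If the two displayed times differ, that difference is the information a float discards.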

    Then we come to another difficulty altogether. All your -replace season2 = - commands are wrong. The variable date2, being a clock variable, is denominated in milliseconds. By contrast, the results of mdy(), being dates, not clocks, are denominated in days. So it's like you are trying to compare something that's measured in millimeters with something that's measured in kilometers without accounting for the difference in units.
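
    The difference in units can be checked directly. The following is a sketch using literal dates chosen only for illustration; dofc() converts a clock value to a daily date, and cofd() goes the other way.

    ```stata
    * A daily date (%td): days since 01jan1960
    display mdy(6,21,2017)

    * A clock value (%tc) for midnight on the same day: milliseconds since 01jan1960 00:00:00
    display %15.0f clock("21jun2017 00:00:00", "DMY hms")

    * The daily date converted to milliseconds, so it is comparable with a clock value
    display %15.0f cofd(mdy(6,21,2017))

    * A clock value converted to a daily date, so it is comparable with mdy()
    display %td dofc(clock("21jun2017 17:41:17", "DMY hms"))
    ```

    Either conversion puts both sides of a comparison in the same units; the corrected code in this post uses dofc() on the clock variable.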

    If you fix all of this up you get consistent and correct results.

    Code:
    split startdate
    gen date1 = date(startdate1,"DMY")
    format date1 %td
    
    
    gen season1 = .
    replace season1 = 1 if inrange(date1,mdy(3,21,year(date1)),mdy(6,20,year(date1)))
    replace season1 = 2 if inrange(date1,mdy(6,21,year(date1)),mdy(9,21,year(date1)))
    replace season1 = 3 if inrange(date1,mdy(9,22,year(date1)),mdy(12,20,year(date1)))
    replace season1 = 4 if date1 >= mdy(12,21,year(date1)) | date1 <= mdy(3,20,year(date1))
    
    gen double date2 = clock(startdate,"DMY hms")
    format date2 %tc
    
    gen season2 = .
    replace season2 = 1 if inrange(dofc(date2),mdy(3,21,year(dofc(date2))),mdy(6,20,year(dofc(date2))))
    replace season2 = 2 if inrange(dofc(date2),mdy(6,21,year(dofc(date2))),mdy(9,21,year(dofc(date2))))
    replace season2 = 3 if inrange(dofc(date2),mdy(9,22,year(dofc(date2))),mdy(12,20,year(dofc(date2))))
    replace season2 = 4 if dofc(date2) >= mdy(12,21,year(dofc(date2))) | dofc(date2) <= mdy(3,20,year(dofc(date2)))
    
    assert season1 == season2
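
    A possible simplification, offered as a sketch rather than as part of the fix above: convert the clock variable to a daily date once and reuse it, instead of repeating dofc(date2) in every condition. The names date3 and season3 are hypothetical, introduced here only for illustration, and the code assumes date2 and season1 already exist from the code above.

    ```stata
    * Convert the clock variable to a daily date once
    gen date3 = dofc(date2)
    format date3 %td

    * Start everyone with a valid date in season 4 (21 Dec through 20 Mar),
    * then overwrite the three ranges that fall inside a single calendar year
    gen byte season3 = 4 if !missing(date3)
    replace season3 = 1 if inrange(date3, mdy(3,21,year(date3)), mdy(6,20,year(date3)))
    replace season3 = 2 if inrange(date3, mdy(6,21,year(date3)), mdy(9,21,year(date3)))
    replace season3 = 3 if inrange(date3, mdy(9,22,year(date3)), mdy(12,20,year(date3)))

    * With the sample data, this should agree with the date-based result
    assert season3 == season1
    ```

    One design note: with the !missing() qualifier, an observation with a missing date keeps a missing season, whereas the season 4 condition in the code above would evaluate as true for a missing date. The sample data contain no missing dates, so the results agree here.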


    • #3
      Clyde,

      Yes, the first mistake you identified, using startdate1, was the result of a careless, last-minute edit to my original code. Sorry about that...I should have known better.

      It is the second issue that was, I think, what I was missing (of course, in addition to needing to make the variable a double). I appreciate your help making me see what I couldn't see before.

      Best,
      Lance
