Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Time variable: string to real time

    I have a variable that records the time of an event. The problem is that some of entries include string reposes such N/A, which make the variable string when imported to Stata. How can I fix this?

    Code:
    * Example generated    by    -dataex-.    To    install:    ssc    install    dataex
    clear
    input str8 alarm
    "25:00"
    "21:05:00"
    "08:37:00"
    "00:00:00"
    "14:05:00"
    "03:10:00"
    "09:14:00"
    "11:46:00"
    "12:06:00"
    "n/a"
    end
    Also, how can I deal with the first observation. I assume it should be 25:00:00 and not 25:00

    Thanks,
    Marvin

  • #2
    Yes, date and time variables can be very complicated--people record them in all sorts of ways. As far as I know, you just have to first code for the most common formatting in the data, and then slog through the exceptions one at a time. For example, the following works for the data you posted:

    Code:
    * Example generated    by    -dataex-.    To    install:    ssc    install    dataex
    clear
    input str8 alarm
    "25:00"
    "21:05:00"
    "08:37:00"
    "00:00:00"
    "14:05:00"
    "03:10:00"
    "09:14:00"
    "11:46:00"
    "12:06:00"
    "n/a"
    end
    
    gen double time = clock(alarm, "hms")
    replace time = clock(alarm, "ms") if missing(time)
    format time %tcHH:MM:SS
    
    list, noobs clean
    You may have to add other -replace- statements as well (always with -if missing(time)-) to fill out all the possibilities. I have often had to use half-a-dozen or more lines of code to deal with date time variables that originated in as strings from Excel or in handwritten records.

    Now, you said you assume that 25:00 should be 25:00:00--but it can't be! There is no 25th hour in the clock. Either it is a flat-out error, or it should be 00:25:00 (which is what I set it as in the code above.) But, there is the question of how to handle an entry like "11:00" if you get one. It could be "11:00:00" or it could be "00:11:00". You will need to either go back to the originator of the source data and ask for clarification on those items, or if that is not feasible, you will have to make some kind of rule and write the code in such a way as to apply it.

    Whatever, you do, at least while you are developing the code, before it goes into "production runs," you should follow the block of -gen- and -replace- statements with

    Code:
    list alarm if missing(time) & !missing(alarm)
    This will enable you to see if there are any more aberrantly recorded values of alarm that you haven't yet converted to a real time. When you've got it right, the listing will produce only things like "N/A" "n/a" "???" "", etc. that clearly cannot be times.

    Comment


    • #3
      Thank you so much! I have some experience converting string dates to real dates but it is the first time I work with a string time.

      Thank you,
      Marvin

      Comment


      • #4
        Hi Clyde Schechter ,

        What quick question. When you said:
        list alarm if missing(time) & !missing(alarm)

        it means?
        list alarm if time != alarm


        missing(time) means if time==. ?

        Comment


        • #5
          The two statements are not the same. Two variables could differ and neither could be missing. 42 and 666 are different and neither is missing.

          Clyde is checking if time is missing but alarm wasn't; that would be cause for alarm.

          The syntax missing() should suggest to you a function. Even if it doesn't you can go

          Code:
           
          help missing
          which explains. The function missing() is more general than just checking for numeric so-called system missing as with strings missing() is true for empty strings and with numeric arguments it is true for .a to .z.

          Comment

          Working...
          X