Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Help needed with a tricky time variable problem

    Hello, gang

    I was wondering if someone could help me with a problem that has been bugging me for quite some time. Can't seem to find any sources on where to find it explained specifically (i've read through most i can find ++ in 24: Working with dates and times). If anyone knows where i could find a document that gives an explanation towards solving my issue, i am, of course, more than happy to read it.



    I got a date variable called "date" which shows me the timestamp for when a respondent finished answering a survey. It is displayed in a string with m/d/y/h/m. Example: "12/26/2022 08:43". I generated a numeric "version" of this time variable by using the code

    "generate double timestamp = clock(date," MDY hm")"

    and then

    "format timestamp %tc" in order to make it more managable.

    What I would like to do is to create a variable that just shows the literal time (f.ex 08:43) and then group every timestamp that is within the hours 00:00 - 00:59, 01:00 - 01:59... and so on. F.ex giving the time interval 00:00 - 00:59 the value 0 and 01:00 - 01:59 the value 1... and so on. My goal is to run a regression to see if answering the survey at different times affects the answers. Also, if its possible, I would like to do a variable for different days of the week (monday - sunday), what week it is (01.01 - 07.01 = week 1), month and year.

    Would greatly appreciate any help!

    Kind regards

    Tor

  • #2
    Here is some example code that may start you in a useful direction.
    Code:
    input str20 date
    "12/26/2022 08:43"
    end
    generate double timestamp = clock(date,"MDY hm")
    format timestamp %tc
    generate hour = hh(timestamp)
    generate day = dow(dofc(timestamp))
    generate month = month(dofc(timestamp))
    Code:
    . list, clean noobs
    
                    date            timestamp   hour   day   month  
        12/26/2022 08:43   26dec2022 08:43:00      8     1      12
    Week of the year is not usually as as useful as we might hope: a year has 1 or 2 days more than 52 full weeks, and both 1 January and 31 December can fall on midweek days, so partial weeks are a substantial problem.

    Comment


    • #3
      This is exactly what i'm looking for. As always, William, thank you so much for your help!

      Follow up question: Would it be possible to make the "hour" variable into 30 mins/half-hours? F.ex 00:00 - 00:29 = 1, 00:30 - 00:59 = 2... and so on.

      Comment


      • #4
        Sounds like a simple arithmetic problem. You cannot know anything about minutes from just looking at the hour, so if you consider minutes, you have to generate such a variable as well.

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str20 date
        "12/26/2022 01:23"
        "12/26/2022 08:43"
        end
        
        gen hour=  real(substr(date, -5, 2))
        gen minute=  real(substr(date, -2, 2))
        gen wanted= (minute>29)+ (2*hour) + 1
        Res.:

        Code:
        . l
        
             +-------------------------------------------+
             |             date   hour   minute   wanted |
             |-------------------------------------------|
          1. | 12/26/2022 01:23      1       23        3 |
          2. | 12/26/2022 08:43      8       43       18 |
             +-------------------------------------------+
        Last edited by Andrew Musau; 03 Jan 2023, 01:05.

        Comment


        • #5
          Thank you, Andrew. I should have specified my question better and not written that i wanted to make the "hour" variable into shorter time intervals. What I meant to say was if it was possible to make the "timestamp" variable into shorter time intervals than what it is now (as shown by William in the "hour" variable). Now it is 24 time intervals defined by the hours of the day, but in the same fashion make the "timestamp" variable into 48 time intervals. But, making it into shorter intervals is not really that important for the analysis, it was just to see if it was possible. Thank you both for your time and help! Really appreciate it!

          Comment


          • #6
            This may be idiosyncratic, but it yields to documented functions. Here is one way to do it.

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input str20 date
            "12/26/2022 01:23"
            "12/26/2022 08:43"
            "1/1/2023 16:07"
            "1/1/2023 23:45"
            end
            
            gen double datetime = clock(date, "MDY hm")
            format datetime %tc 
            gen double timeofday = mod(datetime, 60000 * 60 * 24)
            format timeofday %tcHH:MM 
            gen wanted = ceil(timeofday / (60000 * 30))
            
            list 
            
                 +-----------------------------------------------------------+
                 |             date             datetime   timeof~y   wanted |
                 |-----------------------------------------------------------|
              1. | 12/26/2022 01:23   26dec2022 01:23:00      01:23        3 |
              2. | 12/26/2022 08:43   26dec2022 08:43:00      08:43       18 |
              3. |   1/1/2023 16:07   01jan2023 16:07:00      16:07       33 |
              4. |   1/1/2023 23:45   01jan2023 23:45:00      23:45       48 |
                 +-----------------------------------------------------------+

            Further, you can create your own unusual formats:


            Code:
            . gen WANTED = strofreal(dofc(datetime), "%td") + " " + strofreal(wanted, "%02.0f")
            
            . l
            
                 +--------------------------------------------------------------------------+
                 |             date             datetime   timeof~y   wanted         WANTED |
                 |--------------------------------------------------------------------------|
              1. | 12/26/2022 01:23   26dec2022 01:23:00      01:23        3   26dec2022 03 |
              2. | 12/26/2022 08:43   26dec2022 08:43:00      08:43       18   26dec2022 18 |
              3. |   1/1/2023 16:07   01jan2023 16:07:00      16:07       33   01jan2023 33 |
              4. |   1/1/2023 23:45   01jan2023 23:45:00      23:45       48   01jan2023 48 |
                 +--------------------------------------------------------------------------+

            Comment


            • #7
              This supplied the last two things i was wondering about. Really appreciate it, Nick, thank you!

              Comment


              • #8
                I've got a lot of good work out of mod(), floor() and ceil().

                Comment


                • #9
                  I am absolutely certain I will too

                  Comment


                  • #10
                    I hope you have confirmed that the timestamp data you are starting with reflects the local date and time of the survey respondent (including daylight savings time adjustment if necessary), and not the local date and time at the server that was recording the responses from respondents in one or more different time zones.

                    Comment


                    • #11
                      Yes, the respondents are all from the same country/time zone.

                      Comment


                      • #12
                        Sorry to be bothering you guys again, but i found myself in bit of a new pickle regarding some unexpected formatting when I imported a "similar" dataset into stata. Since it is related to the same topic i didnt make another thread.

                        The date and time for when a respondent finishes a survey is now separated as shown below:
                        ID date time
                        1 12/12/2022 30/12/1899 12:28:00
                        2 12/12/2022 30/12/1899 09:20:00
                        3 05/04/22 30/12/1899 10:00:00
                        4 06/03/2022 30/12/1899 15:37:00
                        5 06/03/2022 30/12/1899 11:52:00
                        As you can see "date" shows the correct date for when the respondent finished the survey and "time" shows a strange date, but with the correct time of day.

                        What i would like to do is to put the correct date and correct time together, and then do the same process as described by my initial problem: separating hours, days etc.

                        Anyone that know what i can do to fix this?

                        Tor

                        Comment


                        • #13
                          If the time of day is accurate and what you want, then #7 already includes most of an answer.

                          Did you notice the use of dataex in #4 and #7? It really is helpful to give us an example using it. See FAQ Advice #12.

                          The example in post #12 is ambiguous on whether your dates are MDY or DMY but in earlier posts you had MDY, But here your time of day variable is manifestly DMY hms.

                          You need to check carefully whether date in #12 is really MDY or DMY.

                          Code:
                          * Example generated by -dataex-. For more info, type help dataex
                          clear
                          input byte id str10 date str19 time
                          1 "12/12/2022" "30/12/1899 12:28:00"
                          2 "12/12/2022" "30/12/1899 09:20:00"
                          3 "05/04/22"   "30/12/1899 10:00:00"
                          4 "06/03/2022" "30/12/1899 15:37:00"
                          5 "06/03/2022" "30/12/1899 11:52:00"
                          end
                          
                          gen double DATE = clock(date, "DMY")
                          replace DATE = clock(date, "DM20Y") if missing(DATE)
                          
                          gen double timeofday = mod(clock(time, "DMY hms"), 60000 * 60 * 24)
                          
                          gen double wanted = DATE + timeofday 
                          format wanted %tc 
                          
                          list date time wanted 
                          
                               +-------------------------------------------------------+
                               |       date                  time               wanted |
                               |-------------------------------------------------------|
                            1. | 12/12/2022   30/12/1899 12:28:00   12dec2022 12:28:00 |
                            2. | 12/12/2022   30/12/1899 09:20:00   12dec2022 09:20:00 |
                            3. |   05/04/22   30/12/1899 10:00:00   05apr2022 10:00:00 |
                            4. | 06/03/2022   30/12/1899 15:37:00   06mar2022 15:37:00 |
                            5. | 06/03/2022   30/12/1899 11:52:00   06mar2022 11:52:00 |
                               +-------------------------------------------------------+
                          Everything I know here I know because I studied
                          Code:
                          help datetime

                          Comment


                          • #14
                            Again, sorry to be bothering you, but i figured it out. I'll post my solution in case anyone else drops into this thread and has the same problem. The code i used was:

                            gen double datetime = clock(time, "DMYhms")
                            format datetime %tc
                            gen double timeofday = mod(datetime, 60000 * 60 * 24)
                            format timeofday %tcHH:MM
                            generate eventdate = date(date, "DMY")
                            format eventdate %td
                            gen double datetime2 = eventdate*24*60*60*1000 + timeofday
                            format datetime2 %tcNN/DD/CCYY_HH:MM:SS

                            Comment


                            • #15
                              Of course, just when i posted i see you post the solution, Nick. I apologize for taking your time

                              Comment

                              Working...
                              X