  • Handling time in stata

    Hey all

    I am having some trouble with transforming some of the time stamps in my data, they are currently formatted like this 21:40:56 and

    The variable appt_length (appointment length in seconds; type = float, format = %9.0g) was calculated in R using the variables consultant_started and consultant_ended (both type = float, format = %tcHH:MM:SS) in my dataset.

    However, in some cases consultant_ended is missing, so I want to calculate and replace the missing values of appt_length (in seconds) using consultant_completed minus consultant_started.

    code:
    gen appt_length_new = appt_length

    count if !missing(consultant_started) & missing(consultant_ended)

    replace appt_length_new = consultant_completed - consultant_started if missing(appt_length_new)

    ^ this doesn't work and transforms the times in a strange format (last line)

    However, as you saw, the appt_length variable was calculated using other software, so it isn't formatted as seconds in Stata.

    How can I
    1. Format the original appt_length variable in seconds correctly for Stata (everything I've read mentions transformation from a string, not from a float)
    2. Use the two variables consultant_completed MINUS consultant_started to calculate and replace the missing values for appt_length in SECONDS
    3. Then create a new variable to transform appt_length into minutes.

    I hope that's all clear. My dissertation deadline is approaching and I'm struggling with the Stata code. Thank you in advance!


    code i used to transform the times

    gen consultant_started1 = clock(consultant_started,"hms")
    format consultant_started1 %tcHH:MM:SS

    Last edited by Nkira Anyika; 08 Nov 2020, 15:56.

  • #2
    Your question really isn't clear without more detail, or at a minimum it is too difficult to guess at a good answer from what you have shared. Please help us help you. Show example data. Show your code. Show us what precisely is wrong with your results. The Statalist FAQ provides advice on effectively posing your questions, posting data, and sharing Stata output.



    • #3
      hey William Lisowski thanks for the response, ive added some code below

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float(appt_length consultant_started consultant_ended consultant_completed appt_length_new)
      2 77941000 77943000 78056000 2
      2 50702000 50704000 51073000 2
      2 61805000 61807000 62006000 2
      4 81730000 81734000  1194000 4
      4 64671000 64675000 64966000 4
      end
      format %tcHH:MM:SS consultant_started
      format %tcHH:MM:SS consultant_ended
      format %tcHH:MM:SS consultant_completed



      code:
      gen appt_length_new = appt_length

      count if !missing(consultant_started) & missing(consultant_ended)

      replace appt_length_new = consultant_completed - consultant_started if missing(appt_length_new)




      code i used to transform the times for consultant_started, consultant_ended and consultant_completed

      gen consultant_started1 = clock(consultant_started,"hms")
      format consultant_started1 %tcHH:MM:SS


      The issue is that the times in the new column appt_length_new aren't in the same format as the rest of the dataset.
      Last edited by Nkira Anyika; 08 Nov 2020, 16:01.



      • #4
        Your problem is that for Stata, clock times are measured in milliseconds, so your appt_length must be multiplied by 1000, and then formatted using the same format as the other clock variables.
        Code:
        . replace appt_length = 1000 * appt_length
        (5 real changes made)
        
        . format %tcHH:MM:SS appt_length
        
        . list appt_length consultant_started consultant_ended, clean abbreviate(20) noobs
        
            appt_length   consultant_started   consultant_ended  
               00:00:02             21:39:01           21:39:03  
               00:00:02             14:05:02           14:05:04  
               00:00:02             17:10:05           17:10:07  
               00:00:04             22:42:10           22:42:14  
               00:00:04             17:57:51           17:57:55
        Let me add that all clock times should be created as double rather than float to ensure full precision.
        Code:
        gen double consultant_started1 = clock(consultant_started,"hms")
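To illustrate the precision point, here is a minimal sketch using Stata's float() function, which rounds a number to float precision. The value 315619343000 is borrowed from the data posted later in this thread (#8); it stands in for any clock value that carries a date part.

```stata
* Sketch: why clock values belong in doubles.
* 315619343000 is the first appt_length3 value listed in #8.
display %15.0f 315619343000
display %15.0f float(315619343000)
* float() rounds to float precision, so the second line does not
* reproduce the millisecond value shown on the first line.
```

A float keeps only about 7 significant digits, while a millisecond clock value with a date part needs 12 or more, so storing it as float silently changes the time.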
        Stata's "date and time" variables are complicated and there is a lot to learn. If you have not already read the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF, do so now. If you have, it's time for a refresher. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. An investment of time that will be amply repaid.

        All Stata manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.
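For the request in #1 (fill in missing appt_length in seconds from consultant_completed minus consultant_started, then convert to minutes), the following is a sketch under stated assumptions rather than a tested solution for the real data. It assumes the consultant_* variables are numeric clock values in milliseconds, as in the dataex listing in #3. The three example rows are adapted from that listing with some values blanked out to create missing cases, and the names elapsed_ms and appt_length_min are hypothetical.

```stata
* Hypothetical example rows adapted from #3 (missing values introduced).
clear
input double(appt_length consultant_started consultant_ended consultant_completed)
2 77941000 77943000 78056000
. 50702000        . 51073000
. 81730000        .  1194000
end
format %tcHH:MM:SS consultant_started consultant_ended consultant_completed

* Elapsed time in milliseconds between start and completion.
generate double elapsed_ms = consultant_completed - consultant_started

* Assumption: a negative difference means the appointment ran past
* midnight exactly once, so add one day of milliseconds.
replace elapsed_ms = elapsed_ms + 24*60*60*1000 if elapsed_ms < 0

* Keep appt_length in seconds; fill gaps from the elapsed time.
generate double appt_length_new = appt_length
replace appt_length_new = elapsed_ms/1000 if missing(appt_length_new)

* Minutes as a plain number.
generate double appt_length_min = appt_length_new/60
list, abbreviate(20)
```

Dividing by 1000 keeps the result in seconds, which matches the units of the R-computed appt_length; multiplying appt_length by 1000 instead, as shown above, is the route to take if the goal is a variable that displays as HH:MM:SS.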



        • #5
          Thank you so much William, yes I have read the documentation but STATA isn't the best at guiding you when you get thrown an error (well for a newish user) - python is amazing at this.
          The first code worked perfectly

          I ran this code
          Code:
          gen double consultant_started1 = clock(consultant_started,"hms")

          and got thrown a type mismatch error

          type mismatch
          r(109);

          error . . . . . . . . . . . . . . . . . . . . . . . . Return code 109
          type mismatch
          In an expression, you attempted to combine a string and numeric
          subexpression in a logically impossible way. For instance, you
          attempted to subtract a string from a number or you attempted
          to take the substring of a number.


          I then tried

          Code:

          gen double consultant_started1= consultant_started

          replace consultant_started1 = clock(consultant_started,"hms")

          and I get thrown the same error.

          I then added
          replace consultant_started1 = clock(consultant_started,"DMYhms")
          because when I click on my dates they are stored as [01jan1960 20:25:27]


          ^ With this information i have no idea how to move forward, i've done a search and the examples are about dates not sure why time isn't covered much



          • #6
            ok so i ended up doing this

            gen double consultant_started1 = consultant_started
            format consultant_started1 %tcHH:MM:SS

            and it worked but i haven't used the clock function ?



            • #7
              In post #1 you told us you did
              Code:
              gen consultant_started1 = clock(consultant_started,"hms")
              format consultant_started1 %tcHH:MM:SS
              In post #4 I told you what that should have been. But meanwhile you have apparently replaced the string variable consultant_started with the numeric variable you created as consultant_started1, so the clock function could not work, because it converts a string to a numeric clock value.
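One way to guard against this mismatch is to check the storage type before choosing a conversion. The sketch below is illustrative only: the two string values are made-up example times, and it assumes a numeric consultant_started already holds milliseconds.

```stata
* Hypothetical example data: times still stored as strings.
clear
input str8 consultant_started
"21:40:56"
"14:05:02"
end

* confirm sets _rc to 0 when the variable is a string.
capture confirm string variable consultant_started
if _rc == 0 {
    * Still a string such as "21:40:56": parse it with clock().
    generate double consultant_started1 = clock(consultant_started, "hms")
}
else {
    * Already numeric (assumed milliseconds): copy into a double.
    generate double consultant_started1 = consultant_started
}
format consultant_started1 %tcHH:MM:SS
list
```

Running describe consultant_started shows the same information interactively: a storage type beginning with str means clock() applies, while float or double means the values are already numeric.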

              Here are three pieces of advice.

              1) Your question really isn't clear without more detail, or at a minimum it is too difficult to guess at a good answer from what you have shared. Please help us help you. Show example data - the original data, not the data you've already altered. Show the code you apply to that data. Show us what Stata told you. Tell us what precisely is wrong. The Statalist FAQ provides advice on effectively posing your questions, posting data, and sharing Stata output.

              2) Stata's "date and time" variables are complicated and there is a lot to learn. If you have not already read the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF, do so now. If you have, it's time for a refresher. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. An investment of time that will be amply repaid.

              3) I'm sympathetic to you as a new user of Stata - there is quite a lot to absorb.

              When I began using Stata in a serious way, I started, as have others here, by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. There are a lot of examples to copy and paste into Stata's do-file editor to run yourself, and better yet, to experiment with changing the options to see how the results change.

              All of these manuals are included as PDFs in the Stata installation and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu. The objective in doing the reading was not so much to master Stata - I'm still far from that goal - as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax, and know how to find out more about them in the help files and PDF manuals.

              Stata supplies exceptionally good documentation that amply repays the time spent studying it - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry and to work effectively.

              When you have a better understanding of Stata you will have a better understanding of the error messages it throws.
              Last edited by William Lisowski; 09 Nov 2020, 05:25.



              • #8
                Ahhh ok that makes sense, thank you William for the advice! A lot of documentation causes the same problems as no documentation at all.
                As in both situations you spend hours trying to find the answer to a seemingly simple question - i have been searching for a while for the solution.

                I am making progress slowly but surely with your help. So as you can see i had to go back clean my data all over again because of your feedback.
                The last problem i'm having is below

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input str7 appt_length float appt_length2 double appt_length3
                "143"   143 315619343000
                "862"   862 315620062000
                "991"   991 315620191000
                "1065" 1065 315620265000
                "329"   329 315619529000
                end
                format %tcHH:MM:SS appt_length3

                CODE:
                * string to numeric
                gen appt_length2 = real(appt_length)

                * numeric to double (from help datetime - this was difficult to interpret)
                generate double appt_length3 = appt_length2* 1000 + mdyhms(1,1,1960,0,0,0)

                format %tcHH:MM:SS appt_length3



                If you look at the first row, 143 seconds (appt_length) does not equal 00:02:23 (appt_length3), so I don't know why Stata is giving me this value when it should be 00:02:38.
                In my original data I start with a str, so I can't apply the code you provided (below); it also gives me the output in a float, not a double.
                replace appt_length = 1000 * appt_length
                format %tcHH:MM:SS appt_length



                • #9
                  if you look at the first row- 143 seconds (appt_length) does not equal 00:02:23 (appt_length3) ... it should be 00:02:38.
                  143 seconds is equal to 2 minutes and 23 seconds, which is displayed as 00:02:23 in hours-minutes-seconds format. It is also equal to 2 and 23/60 minutes, which would be displayed as 2.38 if you converted the times from seconds to minutes and displayed them as a number, as in the example below.

                  Code:
                  . generate appt_length_m = appt_length2/60
                  
                  . format %9.2f appt_length_m
                  
                  . list, abbreviate(20)
                  
                       +-----------------------------------------------------------+
                       | appt_length   appt_length2   appt_length3   appt_length_m |
                       |-----------------------------------------------------------|
                    1. |         143            143       00:02:23            2.38 |
                    2. |         862            862       00:14:22           14.37 |
                    3. |         991            991       00:16:31           16.52 |
                    4. |        1065           1065       00:17:45           17.75 |
                    5. |         329            329       00:05:29            5.48 |
                       +-----------------------------------------------------------+
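For a starting variable that is a string of seconds, as in #8, the whole conversion can be sketched in a few lines. This is an illustration on two values copied from the listing above; the names appt_length_s, appt_length_ms and appt_length_m are hypothetical.

```stata
* Two example values copied from the listing in #8.
clear
input str7 appt_length
"143"
"862"
end

* String to numeric seconds, stored as double.
generate double appt_length_s = real(appt_length)

* Seconds to milliseconds, so the clock display format applies.
generate double appt_length_ms = 1000 * appt_length_s
format %tcHH:MM:SS appt_length_ms

* Seconds to minutes as a plain decimal number.
generate double appt_length_m = appt_length_s / 60
format %9.2f appt_length_m

list, abbreviate(20)

* The clock origin is 01jan1960 00:00:00, so this term evaluates to 0
* and adding it to a duration changes nothing.
display mdyhms(1,1,1960,0,0,0)
```

Note that %tcHH:MM:SS shows only the time-of-day part of a clock value, so a duration of 24 hours or more would wrap around; for long durations the decimal minutes (or hours) variable is the safer thing to report.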



                  • #10
                    Thank you for your help
