Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Manage dates with different digits

    Dear Stata users,

    I'm using the Stata 17 MP and want to manipulate dates with different digits (see below).

    In the first row, for instance, I have "9012002" (7 digits), meaning 09dec2002. In the third row, I have "29052000" (8 digits), which is 29mai2000.

    I tried to transform the dt_admissao variables in many ways, but unfortunately, I could not obtain operationalizable dates. Could you help me solve my problem?

    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str8 dt_admissao
    "9012002"
    "1022002"
    "29052000"
    "17122001"
    "1022002"
    "1032000"
    "1082001"
    "1092002"
    "1081995"
    "1032001"
    end

  • #2
    How did you try and transform it

    Comment


    • #3
      Thank you for the example data presented with the dataex command.

      The following code, which counts character positions backward from the end of the string, will I hope start you in a useful direction. Because the day at the beginning of the string may either be one or two characters long, I append a leanding "0" to be sure there are 8 characters when counting backwards. The end result is a Stata daily date as expected by Stata commands that deal with daily data.
      Code:
      generate day   = real(substr("0"+dt_admissao,-8,2))
      generate month = real(substr(dt_admissao,-6,2))
      generate year  = real(substr(dt_admissao,-4,4))
      generate date = mdy(month,day,year)
      format date %td
      list, clean abbreviate(12)
      Code:
      . list, clean abbreviate(12)
      
             dt_admissao   day   month   year        date  
        1.       9012002     9       1   2002   09jan2002  
        2.       1022002     1       2   2002   01feb2002  
        3.      29052000    29       5   2000   29may2000  
        4.      17122001    17      12   2001   17dec2001  
        5.       1022002     1       2   2002   01feb2002  
        6.       1032000     1       3   2000   01mar2000  
        7.       1082001     1       8   2001   01aug2001  
        8.       1092002     1       9   2002   01sep2002  
        9.       1081995     1       8   1995   01aug1995  
       10.       1032001     1       3   2001   01mar2001
      With that said, a quick piece of advice in case you're new to using Stata and unfamiliar with the way Stata deals with dates and times. Stata's "date and time" variables are complicated and there is a lot to learn. If you have not already read the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF, do so now. If you have, it's time for a refresher. After that, the help datetime documentation will usually be enough to point the way. You can't remember everything; even the most experienced users end up referring to the help datetime documentation or back to the manual for details. But at least you will get a good understanding of the basics and the underlying principles. An investment of time that will be amply repaid.

      All Stata manuals are included as PDFs in the Stata installation and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.

      And my apologies if you knew all that already.

      Comment


      • #4
        Another way would be

        Code:
        . replace dt_admissao = "0" + dt_admissao if length( dt_admissao)<8
        (8 real changes made)
        
        . gen statadate = date( dt_admissao, "DMY")
        
        . format statadate %td
        
        . list
        
             +----------------------+
             | dt_adm~o   statadate |
             |----------------------|
          1. | 09012002   09jan2002 |
          2. | 01022002   01feb2002 |
          3. | 29052000   29may2000 |
          4. | 17122001   17dec2001 |
          5. | 01022002   01feb2002 |
             |----------------------|
          6. | 01032000   01mar2000 |
          7. | 01082001   01aug2001 |
          8. | 01092002   01sep2002 |
          9. | 01081995   01aug1995 |
         10. | 01032001   01mar2001 |
             +----------------------

        Comment


        • #5
          A variant on #4

          Code:
          gen wanted = daily(cond(length(dt_) < 8, "0" + dt_, dt_), "DMY")
          
          format wanted %td

          Another technique

          Code:
          gen y = substr(dt_, -4, 4)
          gen m = substr(dt_, -6, 2)
          gen d = subinstr(dt_, m + y, "", .)
          
          gen wanted2 = mdy(real(m), real(d), real(y))
          Last edited by Nick Cox; 26 Dec 2022, 04:06.

          Comment


          • #6
            Thank you, William, Joro, Nick, and Jared! It worked! I used Nick's code since it was straightforward. Yes, William, I will reread Chapter 24 as I need a refresher, and I still have to do many date operations in the coming weeks.

            Comment

            Working...
            X