  • Converting given dates (in string format) to stata internal format date

    Hi everyone,
    Please consider the following data:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long card_no str9(preauth_date claim_date)
     977221 "07-Jun-07" "23-Jun-07"
     992838 "30-Apr-07" "07-May-07"
    1134065 "29-Sep-07" "22-Oct-07"
     978979 "25-Sep-07" "17-Oct-07"
     185079 "07-Sep-07" "09-Oct-07"
     236298 "17-Jun-07" "04-Jul-07"
      10695 "23-Aug-07" "29-Sep-07"
     991177 "12-Jul-07" "30-Jul-07"
     607283 "16-Apr-07" "08-May-07"
     992004 "02-May-07" "05-Jun-07"
     993019 "01-Jul-07" "30-Jul-07"
     942147 "21-Jul-07" "03-Aug-07"
     993060 "28-Jun-07" "14-Jul-07"
     629390 "29-Sep-07" "16-Nov-07"
     992576 "26-Nov-07" "14-Dec-07"
     991580 "25-Jun-07" "23-Jul-07"
     925376 "06-Sep-07" "11-Oct-07"
     992152 "15-Oct-07" "06-Nov-07"
     122292 "24-Nov-07" "15-Dec-07"
     991293 "05-Jun-07" "20-Jul-07"
     860282 "30-Jun-07" "20-Jul-07"
     990897 "06-Jun-07" "05-Jul-07"
     746860 "07-Oct-07" "20-Oct-07"
     993033 "19-Sep-07" "08-Oct-07"
     992107 "02-May-07" "12-Jun-07"
     290938 "22-May-07" "08-Jun-07"
      73535 "26-Nov-07" "17-Dec-07"
      13620 "26-Sep-07" "10-Dec-07"
     932454 "21-Jul-07" "03-Aug-07"
     992664 "06-May-07" "22-May-07"
     673806 "15-Jun-07" "03-Jul-07"
     991817 "28-Apr-07" "25-May-07"
     533467 "08-Jun-07" "29-Jun-07"
     203570 "14-Nov-07" "06-Dec-07"
     992176 "06-Jun-07" "10-Jul-07"
     619651 "07-Oct-07" "30-Oct-07"
     619651 "23-Sep-07" "05-Nov-07"
     246519 "02-Jun-07" "12-Dec-07"
      71513 "04-Aug-07" "13-Aug-07"
     992992 "01-Jul-07" "29-Aug-07"
     992190 "31-Oct-07" "06-Dec-07"
     991702 "09-May-07" "26-Jun-07"
     108848 "02-Jun-07" "06-Jul-07"
     863219 "17-Jul-07" "03-Aug-07"
     619651 "17-Sep-07" "09-Oct-07"
     724586 "03-May-07" "21-Jul-07"
     991513 "08-Sep-07" "08-Oct-07"
     978980 "02-Oct-07" "22-Oct-07"
     764766 "18-Jun-07" "14-Jul-07"
     991173 "23-Apr-07" "18-Jun-07"
     137826 "19-Nov-07" "18-Dec-07"
     499308 "21-Apr-07" "12-Jun-07"
     991204 "30-Apr-07" "11-Jun-07"
     236293 "23-Aug-07" "19-Sep-07"
     992262 "10-Jun-07" "28-Jul-07"
     991388 "24-Apr-07" "15-Jun-07"
     242941 "03-Aug-07" "06-Sep-07"
     880206 "17-Aug-07" "06-Sep-07"
     615769 "24-Nov-07" "29-Dec-07"
     992277 "28-Jun-07" "17-Aug-07"
     991021 "03-Jun-07" "28-Jun-07"
     992163 "08-Sep-07" "22-Oct-07"
     175543 "30-May-07" "03-Jul-07"
     991950 "05-May-07" "14-Jun-07"
     144112 "26-Oct-07" "22-Nov-07"
     221532 "03-Nov-07" "26-Nov-07"
     147468 "25-Jun-07" "16-Jul-07"
     992379 "16-May-07" "18-Jun-07"
     879588 "29-Jun-07" "16-Aug-07"
     263498 "31-Oct-07" "22-Nov-07"
     244775 "25-May-07" "03-Jul-07"
     142611 "01-Nov-07" "24-Dec-07"
     989288 "11-Oct-07" "31-Oct-07"
     499269 "18-Jul-07" "30-Aug-07"
     990915 "09-May-07" "05-Jul-07"
     992203 "09-Jun-07" "02-Jul-07"
     487467 "22-Aug-07" "19-Sep-07"
     494043 "23-Aug-07" "20-Sep-07"
     732697 "23-Jun-07" "07-Jul-07"
     887898 "20-Jul-07" "03-Sep-07"
     991509 "26-Oct-07" "26-Nov-07"
     992466 "18-Apr-07" "01-Jun-07"
     258721 "02-Sep-07" "28-Sep-07"
     129758 "06-Jun-07" "07-Jul-07"
     991890 "26-May-07" "29-Jun-07"
     992140 "16-Jun-07" "16-Aug-07"
     992108 "02-Nov-07" "26-Nov-07"
     360880 "03-Aug-07" "24-Aug-07"
     251546 "14-Aug-07" "08-Dec-07"
     516376 "29-May-07" "13-Jun-07"
     176467 "08-May-07" "22-Jun-07"
     619651 "13-Sep-07" "15-Oct-07"
     402573 "14-Jun-07" "03-Jul-07"
     992852 "05-Apr-07" "15-Jun-07"
     509464 "17-Aug-07" "10-Sep-07"
     992480 "27-Jun-07" "30-Jul-07"
     118692 "30-Oct-07" "14-Dec-07"
     445026 "19-Sep-07" "27-Oct-07"
      89624 "27-Apr-07" "28-May-07"
     991844 "19-Jun-07" "29-Aug-07"
    end
    I am trying to convert preauth_date and claim_date to Stata internal format dates, with variable names admitdate and dischargedate respectively.
    Based on my understanding of the documentation, I tried the following code:
    Code:
    generate admitdate = date(preauth_date, "DMY")
    generate dischargedate = date(claim_date, "DMY")
    but missing values were generated
    Code:
    . generate admitdate = date(preauth_date, "DMY")
    (2,150,865 missing values generated)
    
    . generate dischargedate = date(claim_date, "DMY")
    (2,150,865 missing values generated)
    Can anyone guide me as to the correct procedure?
    Any help would be greatly appreciated.


    Regards,
    Titir

  • #2
    Try the following, and for the other date too:

    Code:
    . gen statadate = date( preauth_date, "DM20Y")
    
    . format statadate %td



    • #3
      The problem is the two-digit years. Assuming the year is 2007 and not 1907, use "DM20Y" as the mask.



      • #4
        In other words, you read the documentation correctly, but the mask does not fit your data: the years are missing their first two digits, which is why the mask has to be "DM20Y".
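
        Putting #2 to #4 together for both variables, a minimal sketch might look like the following. It assumes the variable names from #1 and that every year really does fall in 2000-2099; the two count commands are only a suggested check, not part of the advice above.

        ```stata
        * Assumes the string variables preauth_date and claim_date from #1,
        * and that all two-digit years belong to 2000-2099.
        generate admitdate     = date(preauth_date, "DM20Y")
        generate dischargedate = date(claim_date, "DM20Y")
        format admitdate dischargedate %td

        * Nonmissing strings that failed to parse show up here.
        count if missing(admitdate) & !missing(preauth_date)
        count if missing(dischargedate) & !missing(claim_date)

        * Optional plausibility check: claims dated before the preauthorization.
        count if dischargedate < admitdate & !missing(admitdate)
        ```

        If the first two counts are zero, every nonmissing string was parsed.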



        • #5
          With probability almost 1 Joro Kolev is giving exactly the right advice. But consider the following examples.


          Code:
          . di %td daily("13-apr-99", "DM20Y")
          13apr2099
          
          . di %td daily("13-apr-99", "DMY", 2021)
          13apr1999
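          With Joro's code you're saying: all dates are from this century. If that's not right, the code is wrong for dates in any previous century. The second call goes further: the third argument (the top year) tells Stata to read each two-digit year as the latest year that does not exceed 2021, so 99 becomes 1999. Naturally, if you have data spanning more than a century, but no century information, it is hard to help.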

          The data in #1 look medical to me and the premise is likely to be correct, but for other applications, watch out.
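
          To make the top-year behaviour concrete, here is a small sketch that can be pasted into Stata. The top year 2021 is taken from the example above; the variable name admitdate_alt is hypothetical, chosen only so that it does not clash with admitdate.

          ```stata
          * Top year 2021: each two-digit year becomes the latest year <= 2021.
          assert daily("07-Jun-07", "DMY", 2021) == mdy(6, 7, 2007)
          assert daily("13-apr-99", "DMY", 2021) == mdy(4, 13, 1999)
          assert daily("13-apr-21", "DMY", 2021) == mdy(4, 13, 2021)
          assert daily("13-apr-22", "DMY", 2021) == mdy(4, 13, 1922)

          * The same idea applied to a variable from #1.
          * admitdate_alt is a hypothetical name used for illustration only.
          generate admitdate_alt = daily(preauth_date, "DMY", 2021)
          format admitdate_alt %td
          ```

          For the data in #1 both approaches should give the same dates, since every year shown there is 07.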

          ***

          There are two small date-related Tips embedded here.

          The first is to get into the habit of calling up daily() rather than date() to insist to yourself and others reading your code that daily dates are being produced. It's the same calculation, so why is that even worth mentioning? The name date() goes back to the time [so to speak] when daily dates were the only kind of date given special support in Stata, given that yearly dates are usually straightforward, modulo uses in cosmology, geology, archaeology and ancient history. Indeed, it matches much ordinary language, at least in English, in which "What is the date?" would be understood as "What is today's date?". Nevertheless there are now also half-yearly, quarterly, monthly and weekly dates, and sometimes date() is misunderstood as a generic function that is smart enough to look at your input and read your mind simultaneously to discern what you want, or what you really need. Not so, which is why using a precise name does no harm and may help.
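
          As a rough illustration of that point, the sketch below checks that daily() and date() agree and shows that monthly and quarterly dates have their own parsing functions. The input strings are made up for the example.

          ```stata
          * daily() and date() are the same function under two names.
          assert daily("23-Jun-2007", "DMY") == date("23-Jun-2007", "DMY")

          * Other date types are parsed by their own functions, not by date().
          assert monthly("2007-06", "YM")  == ym(2007, 6)
          assert quarterly("2007-2", "YQ") == yq(2007, 2)

          * Each date type also needs its own display format.
          di %td daily("23-Jun-2007", "DMY")
          di %tm monthly("2007-06", "YM")
          di %tq quarterly("2007-2", "YQ")
          ```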

          The second is to use display (as shown, di works fine) to fire up examples to check that you have the right idea. This can be useful even if you (think you) have internalized much or even all of help datetime.
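
          For the problem in #1, such a check might be as short as the sketch below, using one date string copied from the data example. Nothing here needs a dataset in memory.

          ```stata
          * Two-digit year with a plain "DMY" mask: the result is missing.
          di daily("07-Jun-07", "DMY")

          * Supplying the century in the mask gives a usable daily date.
          di %td daily("07-Jun-07", "DM20Y")

          * Without a format, the same value shows as days since 01jan1960.
          di daily("07-Jun-07", "DM20Y")
          ```

          Trying a mask on a literal string first is cheaper than generating a variable and finding millions of missing values.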



          • #6
            What you are showing with the top year is very nice, Nick. I have never done it this way, probably because the description in the help file is a bit abstract, and probably because this is a bit too powerful and general for what we typically need. When we write YY we are clearly skipping something implied, something understood by everybody. So as a user I have always assumed ownership of this and supplied the implied century myself. But from a programmer's point of view I can clearly see the advantages of the top year syntax you showed.






            • #7
              Thank you, Joro and Ali. Your solutions worked wonderfully.
