Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Please help me troubleshoot converting string date("20100201") into the format Stata recognizes as date

    Hi, I was converting string date data (variable Fdate and Pdate, they look like "20100201") into numeric data.

    Code:
    clear
    input str6  FirmNo Yr Fdate Pdate
     002410 2010 20100120 20100221
     002410 2011 20110121 20110222
     002410 2012 20120124 20120222
     002319 2010 20100127 20100219
     002319 2011 20110124 20110220
     002319 2012 20120119 20120222
    002590 2010 20100201 20100209 #this observation is important!
    end
    Since Fdate and Pdate are string data, I used real() to turn them into numeric data.

    Code:
    gen PrelimDate = real(Fdate)
    And then I turned them into date format with this code, following advice:

    Code:
    gen GoodFDate = daily(string(PrelimDate, "%8.0f"), "YMD")
    But there were some missing observations in GoodFDate compared to PrelimDate.

    As I look through the dataset I've found that some Fdate - when they are first or last day of the month - were misconverted as 20100200 in PrelimDate. (originally it was 20100201.)

    That is the reason that the number of observations for GoodFDate was a bit smaller than that of PrelimDate.

    My question is,
    1. why did this happen?
    2. How can I fix it?

    Please help!
    Last edited by Eunsang Jee; 17 Oct 2019, 05:07.

  • #2
    This is not how you handle dates in Stata. Do read

    Code:
    help datetime
    for an extensive treatment of date and time variables. The solution is far easier than what you had in mind.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10(FirmNo Yr Fdate Pdate)
    "002410" "2010" "20100120" "20100221"
    "002410" "2011" "20110121" "20110222"
    "002410" "2012" "20120124" "20120222"
    "002319" "2010" "20100127" "20100219"
    "002319" "2011" "20110124" "20110220"
    "002319" "2012" "20120119" "20120222"
    end
    
    gen fdate= date(Fdate, "YMD")
    gen pdate= date(Pdate, "YMD")
    format fdate pdate %td
    Res.:

    Code:
    . l FirmNo Yr fdate pdate, sep(6)
    
         +---------------------------------------+
         | FirmNo     Yr       fdate       pdate |
         |---------------------------------------|
      1. | 002410   2010   20jan2010   21feb2010 |
      2. | 002410   2011   21jan2011   22feb2011 |
      3. | 002410   2012   24jan2012   22feb2012 |
      4. | 002319   2010   27jan2010   19feb2010 |
      5. | 002319   2011   24jan2011   20feb2011 |
      6. | 002319   2012   19jan2012   22feb2012 |
         +---------------------------------------+

    Comment

    Working...
    X