Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Extracting Monthly Date from String with Multiple Daily Date Formats

    Hi all,

    Code:
    input str11 startdate2
    "Apr 1 1997"
    "Apr 1 1997"
    "Apr 1 1999"
    "Apr 1 1999"
    "Apr 1 1999"
    "Apr 1 2000"
    "Apr 1 2000"
    "Apr 1 2000"
    "Jul 1 2001"
    "Jul 1 2001"
    "Jul 1 2001"
    "Jul 1 2001"
    "Jul 1 2001"
    "Jun 1 2002"
    "Jun 1 2002"
    "Jun 1 2002"
    "Jun 1 2003"
    "23-Jun-10"
    "23-Jun-14"
    "23-May-11"
    "23-Nov-13"
    "23-Oct-14"
    I have a dataset with this date variable, which is a string. It is a very weird variable, as the daily date is shown in two different formats: MMM D YYYY and DD-MMM-YY.

    I would like to extract the monthly date from all observations of this variable, i.e. for both date formats.

    I tried the very helpful code of Nick Cox in the following post: https://www.statalist.org/forums/for...rom-a-date-var and the user-written numdate command (from SSC), however I only got missing values generated.

    Please could someone let me know how to extract a monthly date out of this variable?
    Last edited by Maxence Morlet; 16 Sep 2023, 01:57. Reason: EDIT: Title of post made more accurate

  • #2
    I would get a daily date first, as otherwise you just have to ignore the day part in two different ways.

    Code:
    clear 
    input str11 startdate2
    "Apr 1 1997"
    "Apr 1 1997"
    "Apr 1 1999"
    "Apr 1 1999"
    "Apr 1 1999"
    "Apr 1 2000"
    "Apr 1 2000"
    "Apr 1 2000"
    "Jul 1 2001"
    "Jul 1 2001"
    "Jul 1 2001"
    "Jul 1 2001"
    "Jul 1 2001"
    "Jun 1 2002"
    "Jun 1 2002"
    "Jun 1 2002"
    "Jun 1 2003"
    "23-Jun-10"
    "23-Jun-14"
    "23-May-11"
    "23-Nov-13"
    "23-Oct-14"
    end 
    
    duplicates drop 
    gen wanted = daily(startdate2, "MDY")
    replace wanted = daily(startdate2, "DM20Y") if missing(wanted)
    sort wanted 
    format wanted %td 
    list 
    
         +------------------------+
         | startdate2      wanted |
         |------------------------|
      1. | Apr 1 1997   01apr1997 |
      2. | Apr 1 1999   01apr1999 |
      3. | Apr 1 2000   01apr2000 |
      4. | Jul 1 2001   01jul2001 |
      5. | Jun 1 2002   01jun2002 |
         |------------------------|
      6. | Jun 1 2003   01jun2003 |
      7. |  23-Jun-10   23jun2010 |
      8. |  23-May-11   23may2011 |
      9. |  23-Nov-13   23nov2013 |
     10. |  23-Jun-14   23jun2014 |
         |------------------------|
     11. |  23-Oct-14   23oct2014 |
         +------------------------+
    Now it's just

    Code:
    replace wanted = mofd(wanted)

    Comment


    • #3
      Thank you very much! It worked a wonder

      Comment

      Working...
      X