  • Replacing substrings in dates formatted as strings

    Hello,

    My variable DateofDiagnosis is a list of dates. Some have 99 for the day, 99 for the month, and 9999 for the year.
    I want to replace ??/99 with "15/06", and replace ??/??/9999 with "".

    This is my data
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10 DateofDiagnosis
    "08/99/1997"
    "12/99/1999"
    "10/99/1999"
    "03/99/1996"
    "99/99/1977"
    "99/99/9999"
    end
    I replaced trailing values using
    Code:
    replace DateofDiagnosis = "" if substr(DateofDiagnosis, -4, .) == "9999"
    but I can't figure out how to replace the leading values.

    I tried
    Code:
    replace DateofDiagnosis = "15/06" + substr(DateofDiagnosis, -5, .) if substr(DateofDiagnosis, -7, -6) == "99"
    But 0 real changes were made.

    Edit: I also tried
    Code:
    replace DateofDiagnosis = subinstr(DateofDiagnosis, "??/99", "15/06", .)
    but this led to 0 real changes being made.

    What am I missing? Is there a simpler way to do this?
    Last edited by August Anderson; 29 Aug 2018, 00:37. Reason: edit: added another try

  • #2
    Thanks for the clear data example. subinstr() is literal: it won't interpret ?? as a wildcard. substr() takes the length of the substring as its third argument, so -6 makes no sense there.
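
    To see both points on one of the example values, here is a minimal sketch using display. The literal string is taken from the data example, and nothing here changes the dataset:

    ```stata
    * the third argument of substr() is a length: 2 characters, starting 7 from the end
    display substr("08/99/1997", -7, 2)

    * with -6 as the "length", check whether the comparison with "99" can ever hold
    display substr("08/99/1997", -7, -6) == "99"

    * subinstr() searches for the literal text "??/99", which does not occur in the string
    display subinstr("08/99/1997", "??/99", "15/06", .)
    ```

    The first line displays 99 and the third displays the string unchanged, which is why both attempts reported 0 real changes.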

    If your years are 9999 there is nothing useful in the date and your first replace command is fine.

    Otherwise, is this what you want?

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10 DateofDiagnosis
    "08/99/1997"
    "12/99/1999"
    "10/99/1999"
    "03/99/1996"
    "99/99/1977"
    "99/99/9999"
    end
    
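    * year 9999: nothing usable in the date, so blank it
    replace DateofDiagnosis = "" if substr(DateofDiagnosis, -4, 4) == "9999"
    
    * first field is 99: impute 15
    replace DateofDiagnosis = "15" + substr(DateofDiagnosis, 3, .) if substr(DateofDiagnosis, 1, 2) == "99"
    
    * second field is 99: impute 06
    replace DateofDiagnosis = substr(DateofDiagnosis, 1, 3) + "06" + substr(DateofDiagnosis, 6, .) if substr(DateofDiagnosis, 4, 2) == "99"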
    For a different approach see https://www.stata-journal.com/sjpdf....iclenum=dm0062
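
    If pattern matching of the ??/99 kind is wanted, Stata's regular-expression functions are one possibility. This is a minimal sketch, not taken from the linked article. It assumes regexm() and regexr(), where regexr() replaces the first match of a pattern, and it writes to a hypothetical new variable named fixed so that the original strings stay untouched:

    ```stata
    clear
    input str10 DateofDiagnosis
    "08/99/1997"
    "99/99/1977"
    "99/99/9999"
    end

    * work on a copy (hypothetical name) rather than overwriting the original
    generate fixed = DateofDiagnosis

    * year 9999: nothing usable, so blank the value
    replace fixed = "" if regexm(fixed, "/9999$")

    * any two characters, a slash, then 99 at the start: replace that prefix with 15/06
    replace fixed = regexr(fixed, "^../99", "15/06")

    list
    ```

    Note that replacing the whole ??/99 prefix overwrites the first field as well, which differs from the two separate replace commands above.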

    • #3
      Thank you, Nick. This works well, and the Stata Journal article raised some good ideas.

      I think my error was the third argument in substr().
      This code does it in one step:
      Code:
      replace DateofDiagnosis = "15/06" + substr(DateofDiagnosis, -5, .) if substr(DateofDiagnosis, -7, 2) == "99"
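
      One way to check the one-step version against the two-step version in #2 is to apply each to a copy of the variable and list them side by side. A minimal sketch; the variable names onestep and twostep are hypothetical:

      ```stata
      clear
      input str10 DateofDiagnosis
      "08/99/1997"
      "99/99/1977"
      "99/99/9999"
      end

      * blank out dates with year 9999 first, as in the original post
      replace DateofDiagnosis = "" if substr(DateofDiagnosis, -4, 4) == "9999"

      * one-step version: overwrite the first five characters when the second field is 99
      generate onestep = DateofDiagnosis
      replace onestep = "15/06" + substr(onestep, -5, .) if substr(onestep, -7, 2) == "99"

      * two-step version from #2: fix each field only where that field is 99
      generate twostep = DateofDiagnosis
      replace twostep = "15" + substr(twostep, 3, .) if substr(twostep, 1, 2) == "99"
      replace twostep = substr(twostep, 1, 3) + "06" + substr(twostep, 6, .) if substr(twostep, 4, 2) == "99"

      list
      ```

      For "08/99/1997" the one-step version gives 15/06/1997 and the two-step version gives 08/06/1997, because the one-step version overwrites the first field too. Which result is wanted depends on whether the first field should be kept.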
