Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Extracting Date of Birth from ID

    Hi everyone,

    Suppose that there is data with civil ID and date of birth, many times you will find values for date of birth are missing, in this case, it would be useful to obtain the date of birth from the civil ID and verify the data and double check others.

    Lets say the civil ID has 12 digits and look something like this:

    310051209216
    265102508032

    The first digit to the left indicate the century and the second and thirds digits to the left indicate the year so, 310 means that person born in 2010 and 265 means that person born in 1965. The fourth and fifth digits from the left indicate the month and the sixth and seventh digits from left indicate the days. As such the first person according to the civil ID born on 12 of May 2010, while the second person born on 25 October 1965.

    Is there a command to extract those specific digits and convert them into dates like shown above?

    I tried:

    Code:
    gen yrs   = real(substr( var , -12, 3))
    gen months = real(substr( var , -9, 2))
    gen days = real(substr( var , -7, 2))
    But it will be problematic to convert those three variables into one variable representing as a date. Is there any other more elegant way to do that?


    Thanks!

  • #2
    Code:
    clear
    input str12 id
    310051209216
    265102508032
    end
    
    gen year = cond(substr(id, 1, 1) == "2", 1900, 2000) + real(substr(id, 2, 2))
    gen bdate = mdy(real(substr(id, 4, 2)), real(substr(id, 6, 2)), year)
    format bdate %td
    
    list
    
         +---------------------------------+
         |           id   year       bdate |
         |---------------------------------|
      1. | 310051209216   2010   12may2010 |
      2. | 265102508032   1965   25oct1965 |
         +---------------------------------+
    Another way to do it:

    Code:
    gen YEAR =  cond(substr(id, 1, 1) == "2", "19", "20") + substr(id, 2, 2)
    gen BDATE = daily(substr(id, 4, 4) + YEAR, "MDY")
    format BDATE %td
    Each method could be reduced to 2 lines, with perhaps some loss of legibility.
    Last edited by Nick Cox; 31 May 2023, 16:13.

    Comment


    • #3
      Wow! That is brilliant Nick. Thank you!

      Comment

      Working...
      X