  • Separating a string variable into separate variables

    I have a string variable CODEX in which the underlying cause of death is coded first, followed by any secondary causes of death. The cause-of-death codes are ICD-10 codes (International Classification of Diseases, 10th Revision). The codes are separated by spaces or commas.

    I am attempting to split CODEX into separate cause-of-death variables DEATH1, DEATH2, etc. Any suggestions for code to do this would be greatly appreciated, as I am not sure how to approach it.

  • #2
    Code:
    help split


    • #3
      Hi Joseph, thanks for that. I used the following: split codex, parse("," " ") gen(Death)

      This has now created the string variables Death1 to Death10.

      If I wanted to determine how many codes begin with the letter "C" or with "D48", what would be the most efficient way of doing this? I used the 'list' command, but I am guessing there is a better way that avoids counting manually...


      • #4
        It's not clear what you're looking for here.

        Do you want the count, within each decedent, of the causes of death whose ICD-10 codes begin with either "C" or "D48", that is, a by-row count? Maybe something along the lines of
        Code:
        generate byte tot = 0
        foreach var of varlist Death1-Death10 {
            quietly replace tot = tot + 1 if substr(`var', 1, 1) == "C" | substr(`var', 1, 3) == "D48"
        }
        Or do you want the total number of decedents for whom at least one of the cause-of-death ICD-10 codes begins with "C" or "D48"? You could get the latter either from the variable generated above, for example,
        Code:
        count if tot > 0
        or on the original concatenated string.
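
        As a rough sketch of working on the original concatenated string: the lines below assume the variable is named codex (as in the split call in #3), that codes are separated only by commas and/or spaces, and that the codes are stored in upper case. The variable name padded is just a placeholder for illustration.

        ```stata
        * Assumption: codex holds the comma/space-delimited ICD-10 codes.
        * Turn commas into spaces and add a leading space, so that every
        * code, including the first, is preceded by exactly one delimiter.
        generate padded = " " + subinstr(codex, ",", " ", .)

        * A code starts with "C" or "D48" exactly when " C" or " D48" occurs.
        count if strpos(padded, " C") > 0 | strpos(padded, " D48") > 0

        drop padded
        ```

        Padding with a leading space keeps a "C" in the middle of another code from being counted as the start of a code.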
