  • ICD 10 Codes dataset: Please help

    Hi there,

    I urgently need help generating a new variable that contains only the ICD-10 codes related to substance use, taken from an existing variable of ICD-10 codes.
    To be specific, I am working on a dataset whose observations hold numerous ICD-10 codes, but I am only interested in the codes related to substance use (alcohol and illicit drug use). Of note, I have a list of all the codes I am interested in.
    Hence, I would like to extract all the ICD-10 codes related to substance use into a new variable.
    Can someone please guide me on how to go about this?

  • #2
    Your setup is not completely clear to me, but here is a guess: make a new data set containing just the substance use codes and then merge the two data sets; the observations with _merge==3 are then the ones you want. See
    Code:
    help merge
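
    The merge approach described above could be sketched as follows. This is only a sketch under assumptions: the variable name icd10 and the example codes are hypothetical, and the list of substance use codes is assumed to hold each code exactly once so that an m:1 merge is valid.

    ```stata
    * Hypothetical list of substance use codes (one row per code)
    clear
    input str8 icd10
    "F10.1"
    "F11.2"
    end
    tempfile codes
    save `codes'

    * Hypothetical main data set
    clear
    input str8 icd10
    "F10.1"
    "J01.0"
    "F11.2"
    "G09"
    end

    * Keep every row of the main data; _merge==3 marks codes found in the list
    merge m:1 icd10 using `codes', keep(master match)
    generate new_icd10 = icd10 if _merge == 3
    drop _merge
    list
    ```

    The advantage over pattern matching is that the list can hold any set of codes, not just ones that share a prefix.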



    • #3
      I can help with that.

      Let's say your ICD codes are in a variable called icd10.

      Substance use is the F10 to F19 range in ICD.

      So what I would do:

      gen new_icd10=icd10 if strmatch(icd10, "F1*")

      strmatch() tests whether a string matches a pattern; the asterisk is a wildcard that stands for zero or more characters at that position. So what this does is create a new variable, new_icd10, that copies the value from your original variable if it starts with the characters F1, no matter what comes after that. If it doesn't start with F1, the new variable will be blank (an empty string).

      You would end up with something like this:
      icd10      new_icd10
      F10.151    F10.151
      F20.2      -
      F13.92     F13.92
      J01.0      -
      G09        -
      F11.10     F11.10

      Just a note: depending on what you're working on, you may want to exclude nicotine from your analysis. Nicotine is the F17 range.
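
      If nicotine should be left out, one possible way is to add a second condition to the same line. This is a minimal sketch with made-up example codes; the variable name icd10 is the assumed name used above.

      ```stata
      clear
      input str8 icd10
      "F10.151"
      "F17.210"
      "F20.2"
      "F13.92"
      end

      * Keep codes starting with F1, except those starting with F17 (nicotine)
      generate new_icd10 = icd10 if strmatch(icd10, "F1*") & !strmatch(icd10, "F17*")
      list
      ```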



      • #4
        Stata has the icd10 command for exactly this purpose. It is a very handy approach when dealing with ICD-10 codes.
        Best regards,

        Marcos
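
        As a rough illustration of that suggestion, icd10 generate with the range() option creates an indicator for codes inside a range, which can then be used to copy the matching codes. The sketch assumes a recent Stata version and codes that Stata's WHO ICD-10 list recognises; the variable names and example codes are hypothetical, and data coded in ICD-10-CM may need the icd10cm command instead.

        ```stata
        clear
        input str8 icd10
        "F10.1"
        "F20.2"
        "J01.0"
        "F11.2"
        end

        * Indicator: 1 if the code lies in the F10-F19 block, 0 otherwise
        icd10 generate subuse = icd10, range(F10/F19)
        generate new_icd10 = icd10 if subuse == 1
        list
        ```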



        • #5
          Thank you so much Shannon, I think this is what I need. I will try it now and let you know if it works.



          • #6
            Hi Shannon, thanks much, the syntax works.



            • #7
              Hi Shannon, I'm having trouble extracting more than one code at a time. For instance, I want to generate a new variable that contains the ICD-10 F codes and, let's say, G codes at the same time.
              Something like this: gen new_icd10p123=diagnosis_codep if strmatch(diagnosis_codep, "F1*" "G3*" "G6*"), but this didn't work.

              Please help.



              • #8
                Hi, Helen -

                I saw your private message, but it seems I can't reply to you directly. Hope you see this.

                If you want it to be more than one string match at a time, you put a | between each clause. That basically means "or" in this context; if it matches this or this, do this.

                gen new_icd10p123=diagnosis_codep if strmatch(diagnosis_codep, "F1*") | strmatch(diagnosis_codep, "G3*") | strmatch(diagnosis_codep, "G6*")

                The only problem is that there is a limit - I think you can't do more than 8 clauses at a time. So if you have more than 8 codes, you'd probably want to switch to doing a replace after the first line.

                You do your gen new, and then...

                replace new_icd10p123=diagnosis_codep if strmatch(diagnosis_codep, "F2*") | strmatch(diagnosis_codep, "F3*") [etc.]
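
                The earlier attempt failed because strmatch() takes exactly two arguments, a string and one pattern. When the list of patterns grows, an alternative to chaining clauses is to loop over the patterns. This is a minimal sketch with made-up example codes, using the variable names from this thread.

                ```stata
                clear
                input str8 diagnosis_codep
                "F10.1"
                "G35"
                "J01.0"
                "G61.0"
                "F20.2"
                end

                * Start with an empty string variable, then fill it pattern by pattern
                generate new_icd10p123 = ""
                foreach pat in "F1*" "G3*" "G6*" {
                    replace new_icd10p123 = diagnosis_codep if strmatch(diagnosis_codep, "`pat'")
                }
                list
                ```

                Adding another code range then only means adding one more pattern to the foreach list.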
