
  • Recoding Multiple Variables and give them consecutive values

    Hello,

    I am working with an ICD-10 dataset of nearly 250K observations. In essence, I would like to classify all the diagnoses into the 21 ICD-10 chapters. I have created variables ICD10_Chapter1 - ICD10_Chapter21, each of which takes the values 0 and 1. What I want to do now is recode all of them (ICD10_Chapter*) so that, instead of 0 and 1, each takes its chapter number, as follows:
    ICD10_Chapter1 = 1
    ICD10_Chapter2 = 2
    ICD10_Chapter3 = 3
    ICD10_Chapter4 = 4
    ICD10_Chapter5 = 5
    ICD10_Chapter21 = 21

    To give you an idea of what I did, I used icd10 gen to code each chapter separately. For example, I coded chapter 11 like this: icd10 gen ICD10_Chapters11 = PrincDiag, range(K00/K93)

    I have included the main diagnosis codes in case someone has a better, more efficient approach.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str4 PrincDiag byte(ICD10_Chapters ICD10_Chapters2 ICD10_Chapters3 ICD10_Chapters4 ICD10_Chapters5 ICD10_Chapters6 ICD10_Chapters7 ICD10_Chapters8 ICD10_Chapters9 ICD10_Chapters10)
    "O342" 0 0 0 0 0 0 0 0 0 0
    "R104" 0 0 0 0 0 0 0 0 0 0
    "K358" 0 0 0 0 0 0 0 0 0 0
    "J931" 0 0 0 0 0 0 0 0 0 1
    "O249" 0 0 0 0 0 0 0 0 0 0
    "I10"  0 0 0 0 0 0 0 0 1 0
    "N202" 0 0 0 0 0 0 0 0 0 0
    "I249" 0 0 0 0 0 0 0 0 1 0
    "K649" 0 0 0 0 0 0 0 0 0 0
    "O269" 0 0 0 0 0 0 0 0 0 0
    "O800" 0 0 0 0 0 0 0 0 0 0
    "I501" 0 0 0 0 0 0 0 0 1 0
    "M480" 0 0 0 0 0 0 0 0 0 0
    "C509" 0 1 0 0 0 0 0 0 0 0
    "Q211" 0 0 0 0 0 0 0 0 0 0
    "A239" 1 0 0 0 0 0 0 0 0 0
    "N133" 0 0 0 0 0 0 0 0 0 0
    "N939" 0 0 0 0 0 0 0 0 0 0
    "O800" 0 0 0 0 0 0 0 0 0 0
    "I64"  0 0 0 0 0 0 0 0 1 0
    "C543" 0 1 0 0 0 0 0 0 0 0
    "J988" 0 0 0 0 0 0 0 0 0 1
    "O800" 0 0 0 0 0 0 0 0 0 0
    "N210" 0 0 0 0 0 0 0 0 0 0
    "J459" 0 0 0 0 0 0 0 0 0 1
    "A239" 1 0 0 0 0 0 0 0 0 0
    "O034" 0 0 0 0 0 0 0 0 0 0
    "C169" 0 1 0 0 0 0 0 0 0 0
    "O800" 0 0 0 0 0 0 0 0 0 0
    "R104" 0 0 0 0 0 0 0 0 0 0
    "E111" 0 0 0 1 0 0 0 0 0 0
    "J189" 0 0 0 0 0 0 0 0 0 1
    "J069" 0 0 0 0 0 0 0 0 0 1
    "I739" 0 0 0 0 0 0 0 0 1 0
    "K409" 0 0 0 0 0 0 0 0 0 0
    "K566" 0 0 0 0 0 0 0 0 0 0
    "C509" 0 1 0 0 0 0 0 0 0 0
    "U071" 0 0 0 0 0 0 0 0 0 0
    "E114" 0 0 0 1 0 0 0 0 0 0
    "I214" 0 0 0 0 0 0 0 0 1 0
    "O800" 0 0 0 0 0 0 0 0 0 0
    "A153" 1 0 0 0 0 0 0 0 0 0
    "K358" 0 0 0 0 0 0 0 0 0 0
    "O441" 0 0 0 0 0 0 0 0 0 0
    "C509" 0 1 0 0 0 0 0 0 0 0
    "I219" 0 0 0 0 0 0 0 0 1 0
    "N309" 0 0 0 0 0 0 0 0 0 0
    "F311" 0 0 0 0 1 0 0 0 0 0
    "R572" 0 0 0 0 0 0 0 0 0 0
    "K358" 0 0 0 0 0 0 0 0 0 0
    end

    Thank you

  • #2
    In your data example you don't have any of the variables ICD10_Chapter1 - ICD10_Chapter21

    The names on display are ICD10_Chapters ICD10_Chapters2 through to ICD10_Chapters10

    Guessing from a mix of what you claim to have and what you show, this might help

    Code:
    rename ICD10_Chapters ICD10_Chapters1 
    
    forval j = 1/21 { 
         replace ICD10_Chapters`j'  = IC10_Chapters`j' * `j' 
    }
    but make sure you have saved your data before you do this, so that you can back up if this messes up your data.



    • #3
      Thanks Nick. Yes, I tried to include all 21 variables, but Stata said too many variables were specified, which is why I limited the example to 10.

      Is there a way to generate a new variable instead of replacing the existing ones? I don't want to risk messing up the original data, as you warn.
      Also, when I ran your command above after renaming the first variable, I got the error message "IC10_Chapters1 not found", although the variable exists.



      • #4
        I think I have not explained it well. Each of the variables ICD10_Chapters1 to ICD10_Chapters21 contains only the values 0 and 1, so

        ICD10_Chapters1 has, for example, 4500 observations equal to 1 and the rest equal to 0
        ICD10_Chapters2 has, for example, 6320 observations equal to 1 and the rest equal to 0
        ICD10_Chapters3 has, for example, 846 observations equal to 1 and the rest equal to 0
        ICD10_Chapters4 has, for example, 1238 observations equal to 1 and the rest equal to 0
        ICD10_Chapters5 has, for example, 794 observations equal to 1 and the rest equal to 0
        etc.

        I want to generate a single new variable that combines all of these variables, giving each observation the number of its chapter. Following the example above:

        gen ICD10_ALL_Chapter = 1 if ICD10_Chapters1 ==1

        replace ICD10_ALL_Chapter = 2 if ICD10_Chapters2 ==1

        replace ICD10_ALL_Chapter = 3 if ICD10_Chapters3 ==1

        replace ICD10_ALL_Chapter = 4 if ICD10_Chapters4 ==1



        • #5
          The error message

          Code:
          IC10_Chapters1 not found 
          is the result of a typo on my part. Sorry about that. The code naturally should have been


          Code:
            
           replace ICD10_Chapters`j'  = ICD10_Chapters`j' * `j'
          The request in #4 is different and asks for a single summary variable. But you can use the advice already given in #2 -- use a loop, not a statement for every such variable.

          For example,

          Code:
          gen ICD10_ALL_Chapter = 0
          
          forval j = 1/21 {
                replace ICD10_ALL_Chapter = `j'  if ICD10_Chapters`j' ==1
          }
          However, this summary is of limited use if any patient is 1 on two or more variables. It then becomes the code of the last such variable.
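
          One way to check for that situation before building the summary is to count the 1s in each observation. What follows is only a minimal sketch on made-up toy data with three indicators: the variable name nchapters and the toy values are assumptions for illustration, not part of the original data, and the range 1/3 would become 1/21 with the full set of variables.

          ```stata
          * Toy data: three hypothetical 0/1 chapter indicators (not the poster's data)
          clear
          input byte(ICD10_Chapters1 ICD10_Chapters2 ICD10_Chapters3)
          1 0 0
          0 1 0
          0 0 1
          0 0 0
          end

          * Count how many indicators are 1 in each observation
          egen byte nchapters = rowtotal(ICD10_Chapters1-ICD10_Chapters3)

          * Stop with an error if any observation is flagged for two or more chapters
          assert nchapters <= 1

          * Build the summary in a new variable, leaving the indicators untouched
          gen byte ICD10_ALL_Chapter = 0
          forval j = 1/3 {
              replace ICD10_ALL_Chapter = `j' if ICD10_Chapters`j' == 1
          }

          tabulate ICD10_ALL_Chapter
          ```

          If the assert fails, something like list if nchapters > 1 would show the offending observations. Observations with no indicator set keep the value 0 in the summary variable.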
          Last edited by Nick Cox; 22 Oct 2022, 07:51.



          • #6
            Works well. That problem will not arise, because an observation cannot have two or more different primary diagnosis codes. Thanks, Nick!
