Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshaping long to wide problems

    Hello everyone,

    I am dealing with a large national database (TQIP) which has a series of variables (ICD and AIS codes) stored in long datasets such as:

    inc_key icdprocedurecode icdprocedurecode_biu icdprocedureversion proceduremins proceduredays
    1.800e+11 BW28ZZZ ICD10 145 1
    1.800e+11 B030ZZZ ICD10 3149 3
    1.800e+11 BR29ZZZ ICD10 165 1
    1.800e+11 BW251ZZ ICD10 151 1
    1.800e+11 BR20ZZZ ICD10 161 1
    1.800e+11 BR27ZZZ ICD10 164 1
    1.800e+11 BR29ZZZ ICD10 10 1

    1.900e+11 BW28ZZZ ICD10 9 1
    1.900e+11 BR20ZZZ ICD10 9 1
    1.900e+11 BW25YZZ ICD10 10 1
    1.900e+11 BW40ZZZ ICD10 5 1
    1.900e+11 BR27ZZZ ICD10 10 1
    1.900e+11 0HQ0XZZ ICD10 40 1
    1.900e+11 BW28ZZZ ICD10 30 1
    1.900e+11 B24CZZZ ICD10 1

    Essentially, I have one common variable i (inc_key) to use in reshape but there is no unique j variable to use with reshape such as year or numbering of any kind since the order of the ICD 10 codes does not matter for this analysis-- I just need them separated and categorized into a single inc_key number for each (so icdcode1, icdcode2, etc. for each unique inc_key number), but I'm not sure how to do this.

    I hope this wasn't confusing.

  • #2
    Welcome to Statalist.

    I hope this wasn't confusing.
    Glad that you said it! Please refer to the FAQ (http://www.statalist.org/forums/help) and use -dataex- to provide some data example. That way forum users can set up the data and check their codes. And please also be aware that by "example" they do not have to be real data, it just have to be data that reflect the actual format. Also, your inc_key seems to be expressed in scientific expression, please double check if they are in the right data type (https://www.stata.com/manuals/ddatatypes.pdf).

    Comment


    • #3
      Thank you! Here's a more representative dataset. inc_key is not in scientific notation in the actual variable:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input double inc_key str8 icdprocedurecode byte icdprocedurecode_biu str5 icdprocedureversion long proceduremins int proceduredays
      180008916803 "BW28ZZZ" . "ICD10"   145  1
      180008916803 "B030ZZZ" . "ICD10"  3149  3
      180008916803 "BR29ZZZ" . "ICD10"   165  1
      180008916803 "BW251ZZ" . "ICD10"   151  1
      180008916803 "BR20ZZZ" . "ICD10"   161  1
      180008916803 "BR27ZZZ" . "ICD10"   164  1
      180008916805 "BR29ZZZ" . "ICD10"    10  1
      180008916805 "BW28ZZZ" . "ICD10"     9  1
      180008916805 "BR20ZZZ" . "ICD10"     9  1
      180008916805 "BW25YZZ" . "ICD10"    10  1
      180008916805 "BW40ZZZ" . "ICD10"     5  1
      180008916805 "BR27ZZZ" . "ICD10"    10  1
      180008916805 "0HQ0XZZ" . "ICD10"    40  1
      180008916807 "BW28ZZZ" . "ICD10"    30  1
      180008916807 "B24CZZZ" . "ICD10"     .  1
      180008916807 "BW40ZZZ" . "ICD10"     .  1
      180008916807 "BN25ZZZ" . "ICD10"    30  1
      180008916807 "BR20ZZZ" . "ICD10"    30  1
      180008916807 "0HQ0XZZ" . "ICD10"    35  1
      180008916807 "BN26ZZZ" . "ICD10"    30  1
      180008916807 "BW251ZZ" . "ICD10"    30  1
      180008916807 "2W38X1Z" . "ICD10"     .  1
      180008916813 "BW40ZZZ" . "ICD10"    28  1
      180008916813 "BR20ZZZ" . "ICD10"    37  1
      180008916813 "BW28ZZZ" . "ICD10"    36  1
      180008916813 "BW25YZZ" . "ICD10"    37  1
      180008916813 "BR29ZZZ" . "ICD10"    37  1
      180008916813 "BR27ZZZ" . "ICD10"    37  1
      180008916817 "0RRL0JZ" . "ICD10"  1297  2
      180008916817 "BW28ZZZ" . "ICD10"   160  1
      180008916817 "0JQ13ZZ" . "ICD10"   468  1
      180008916817 "BR20ZZZ" . "ICD10"   159  1
      180008916817 "BP2EZZZ" . "ICD10"   460  1
      180008916817 "01Q40ZZ" . "ICD10"  1297  2
      180008916819 "04CL0ZZ" . "ICD10"  8194  7
      180008916819 "4A02X4Z" . "ICD10"  9708  8
      180008916819 "04UL0KZ" . "ICD10"  8194  7
      180008916819 "B24BZZZ" . "ICD10"  6004  5
      180008916819 "0T9B70Z" . "ICD10"  8183  7
      180008916819 "BR39ZZZ" . "ICD10"  3919  4
      180008916819 "BQ3MZZZ" . "ICD10"  3920  4
      180008916819 "BR37ZZZ" . "ICD10"   692  2
      180008916822 "0QS904Z" . "ICD10"  1013  2
      180008916824 "0SBC4ZZ" . "ICD10"   399  1
      180008916824 "0JQN3ZZ" . "ICD10"   399  1
      180008916831 "BQ20ZZZ" . "ICD10"   288  1
      180008916834 "BR20ZZZ" . "ICD10"    33  1
      180008916834 "0JQ03ZZ" . "ICD10"     8  1
      180008916834 "BW28ZZZ" . "ICD10"    33  1
      180008916848 "B24BZZZ" . "ICD10"  1565  2
      180008916848 "0HQ1XZZ" . "ICD10"   154  1
      180008916848 "BP29ZZZ" . "ICD10"   124  1
      180008916848 "BW28ZZZ" . "ICD10"    54  1
      180008916848 "BR20ZZZ" . "ICD10"    55  1
      180008916850 "0Y6X0Z0" . "ICD10"  5304  5
      180008916850 "B44FZZZ" . "ICD10"  2115  3
      180008916850 "0JBQ0ZZ" . "ICD10"  5304  5
      180008916856 "B31R1ZZ" . "ICD10"     .  3
      180008916856 "BR20ZZZ" . "ICD10"    37  1
      180008916856 "BW191ZZ" . "ICD10"     .  3
      180008916856 "BW40ZZZ" . "ICD10"     9  1
      180008916856 "BQ2GZZZ" . "ICD10"    37  1
      180008916856 "BW211ZZ" . "ICD10"    37  1
      180008916856 "B030ZZZ" . "ICD10"     .  3
      180008916856 "BW241ZZ" . "ICD10"    37  1
      180008916856 "B24CZZZ" . "ICD10"     9  1
      180008916856 "BW28ZZZ" . "ICD10"    37  1
      180008916858 "BP1JZZZ" . "ICD10"  1406  2
      180008916858 "0PSH04Z" . "ICD10"  1406  2
      180008916858 "2W38X1Z" . "ICD10"   127  1
      180008916861 "B54PZZZ" . "ICD10" 10182  8
      180008916861 "05HC33Z" . "ICD10" 28185 21
      180008916861 "0NS00ZZ" . "ICD10" 29998 22
      180008916861 "5A1945Z" . "ICD10"     .  1
      180008916861 "009630Z" . "ICD10" 33792 25
      180008916861 "0T9B70Z" . "ICD10"  5469  5
      180008916861 "30233R1" . "ICD10"    86  1
      180008916861 "00C40ZZ" . "ICD10"   139  2
      180008916861 "BW28ZZZ" . "ICD10"   449  2
      180008916861 "B24BZZZ" . "ICD10"  5365  5
      180008916862 "BQ13ZZZ" . "ICD10"  1566  2
      180008916862 "0QSB04Z" . "ICD10"  1566  2
      180008916862 "0T9B70Z" . "ICD10"   553  1
      180008916868 "BW251ZZ" . "ICD10"    19  1
      180008916869 "0W9B30Z" . "ICD10"     .  2
      180008916869 "BW25ZZZ" . "ICD10"    66  1
      180008916869 "BW28ZZZ" . "ICD10"    64  1
      180008916869 "0HQ1XZZ" . "ICD10"   355  2
      180008916869 "0CQ0XZZ" . "ICD10"   355  2
      180008916869 "30233N1" . "ICD10"    34  1
      180008916869 "3E10X8Z" . "ICD10"   355  2
      180008916869 "B24CZZZ" . "ICD10"    21  1
      180008916869 "BR20ZZZ" . "ICD10"    75  1
      180008916869 "2W31X9Z" . "ICD10"   356  2
      180008916869 "BR29ZZZ" . "ICD10"    80  1
      180008916869 "BN25ZZZ" . "ICD10"    64  1
      180008916869 "2W31X9Z" . "ICD10"   355  2
      180008916869 "BW40ZZZ" . "ICD10"    21  1
      180008916869 "BR27ZZZ" . "ICD10"    79  1
      180008916869 "5A12012" . "ICD10"   918  2
      end
      180008916813 "BW40ZZZ" . "ICD10" 28 1

      As you can see, there's several non-unique inc_key variables, each with a unique ICD-10 code.
      Last edited by Avanti Badrinathan; 30 Nov 2021, 18:11.

      Comment


      • #4
        When there is no suitable j() variable in the data, you have to create one. If the order doesn't matter for analysis, then any order will do. The following code orders them alphabetically by icdprocedurecode.

        Code:
        by inc_key (icdprocedurecode), sort: gen _j = _n
        reshape wide icdprocedure* procedure*, i(inc_key) j(_j)
        Note: I don't know what you are going to be doing next with the data. Suffice it to be warned that there aren't many things in Stata that work well with wide data. So, unless you will be doing something that truly requires changing to wide layout, you are better off leaving the data as is.

        Comment

        Working...
        X