Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshaping Dataset from Wide to Long (with many variables and waves).

    Dear Statalist community,

    This might be a simple question, but I am having trouble reshaping my dataset from wide to long.

    Essentially, I was wondering, when I have 6000 variables that occur across three wave periods and follow a consistent naming pattern like "Aw1", "Aw2", "Aw3", "Bw1", "Bw2", "Bw3", and so on, is there any easy way to reshape the data to long without typing all the stub names?

    I have used the explanation previously provided here (https://www.stata.com/support/faqs/d...-with-reshape/) to create an initial code, but I have been getting a syntax error.

    Code that I have been using:
    Code:
    unab vars : *w1 *w2 *w3 
    local stubs1 : subinstr local vars "w1" "", all 
    local stubs2 : subinstr local vars "w2" "", all 
    local stubs3 : subinstr local vars "w3" "", all 
    
    reshape long `stubs1' `stubs2' `stubs3', i(id_key) j(wave) string
    Error message I got for using the code:
    invalid syntax
    r(198);
    Any insight would be greatly appreciated!


    Here is an example dataset as needed:
    Code:
    * Example generated by -dataex-. To install: ssc install    dataex
    clear
    input byte(ID Aw1 Aw2 Aw3 Bw1 Bw2 Bw3 Cw1 Cw2 Cw3)
    1 50 50 50 50 50 50 50 50 50
    2 60 60 60 60 60 60 60 60 60
    3 70 70 70 70 70 70 70 70 70
    4 80 80 80 80 80 80 80 80 80
    5 90 90 90 90 90 90 90 90 90
    end
    Last edited by Mariame Shaw; 08 Aug 2023, 10:48.

  • #2
    If the same variables are used in all three waves, then your local macros stubs1, stubs2, and stubs3 will all be the same. Even if it is not entirely the same variables used in all waves, there is probably substantial overlap. As a result there will be repetitions of the same stub among stubs1, stubs2, and stubs3. That is the source of your syntax error. If all of the stubs do occur in wave 1, then you only need local stubs1. In the worst case scenario where there are some "new" variables in waves 2 and also in waves 3, then after you define local macros stubs1, stubs2, and stubs3 you can do this:
    Code:
    local all_stubs `stubs1' `stubs2' `stubs3'
    local all_stubs: list uniq all_stubs
    reshape long `all_stubs', i(id_key) j(wave) string
    Correction: There is another problem. Your definitions of local macros stubs1, stubs2, and stubs3 leaves behind variables ending in *w2 in stubs1 and stubs3, and ending in *w1 in stubs2 and stubs3, etc. It should be this:
    Code:
    unab stubs: *w1 *w2 *w3
    forvalues i = 1/3 {
        local stubs: subinstr local stubs "w`i'" "", all
    }
    local stubs: list uniq stubs
    reshape long `stubs', i(id_key) j(wave) string
    Last edited by Clyde Schechter; 08 Aug 2023, 10:44.

    Comment


    • #3
      Thank you so very much!

      I believe this has answered my main question!

      I did run into the issue, though, where I am getting the following error after the code:

      (note: j = w1 w2 w3)
      variable cxxp_w1 type mismatch with other cxxp variables
      r(198);
      I have gotten the error multiple times with different variables. I was thus wondering if there was a way to potentially list all the variables that might have a mismatch so that I can recode them at once, rather than having to run the code each time just to get the message that there is a mismatch with another variable?

      THANK YOU SO MUCH AGAIN!

      Comment


      • #4
        As no sample data has been provided, this code is untested. It may contain typos or other errors, but it has the gist of a solution. If you need assistance adapting it to your data, please post back using the -dataex- command to show a relevant data example.

        Code:
        foreach s of local stubs {
            local types
            forvalues i = 1/3 {
                capture local type`i': type `s'w`i'
                if (c(rc) == 0 {    
                    local types `types' `type`i''
                }
                else if c(rc) != 111 { // SKIP IF VARIABLE `s'w`i' DOES NOT EXIST
                    display as error `"Unexpected error processing stub `s'"'
                    exit c(rc)
                }
                local types: list sort types
                local n_vars: word count `types'
                if `"`:word 1 of `types''"' != `"`:word `n_vars' of `types''"' { // MISMATCH FOUND
                    display `"Stub `s' mismatch: `types'"'
                }
            }
        }
        If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

        Comment


        • #5
          Thank you so very much for all the help!!

          I hope I did this right, but here is an example of one of the variables which I received an error from!

          Error message:
          (note: j = w1 w2 w3)
          variable ds006_w3 type mismatch with other ds006 variables
          r(198);
          Here is a sample of my dataset
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float id_key byte ds006_4_w1 int(ds006_7_w1 ds006_3_w1) long(ds006_2_w1 ds006_1_w1) str15 ds006_w2 long ds006_w3
            1 . . . .    . "" .
            2 . . . .    . "" .
            3 . . . .    . "" .
            4 . . . .    . "" .
            5 . . . .    . "" .
            6 . . . .    . "" .
            7 . . . .    . "" .
            8 . . . .    . "" .
            9 . . . .    . "" .
           10 . . . .    . "" .
           11 . . . .    . "" .
           12 . . . .    . "" .
           13 . . . .    . "" .
           14 . . . .    . "" .
           15 . . . .    . "" .
           16 . . . .    . "" .
           17 . . . .    . "" .
           18 . . . .    . "" .
           19 . . . .    . "" .
           20 . . . .    . "" .
           21 . . . .    . "" .
           22 . . . .    . "" .
           23 . . . .    . "" .
           24 . . . .    . "" .
           25 . . . .    . "" .
           26 . . . .    . "" .
           27 . . . .    . "" .
           28 . . . .    . "" .
           29 . . . .    . "" .
           30 . . . .    . "" .
           31 . . . .    . "" .
           32 . . . .    . "" .
           33 . . . .    . "" .
           34 . . . .    . "" .
           35 . . . .    . "" .
           36 . . . .    . "" .
           37 . . . .    . "" .
           38 . . . .    . "" .
           39 . . . .    . "" .
           40 . . . .    . "" .
           41 . . . .    . "" .
           42 . . . .    . "" .
           43 . . . .    . "" .
           44 . . . .    . "" .
           45 . . . .    . "" .
           46 . . . .    . "" .
           47 . . . .    . "" .
           48 . . . .    . "" .
           49 . . . .    . "" .
           50 . . . .    . "" .
           51 . . . .    . "" .
           52 . . . .    . "" .
           53 . . . .    . "" .
           54 . . . .    . "" .
           55 . . . .    . "" .
           56 . . . .    . "" .
           57 . . . .    . "" .
           58 . . . .    . "" .
           59 . . . .    . "" .
           60 . . . .    . "" .
           61 . . . .    . "" .
           62 . . . .    . "" .
           63 . . . .    . "" .
           64 . . . .    . "" .
           65 . . . .    . "" .
           66 . . . .    . "" .
           67 . . . .    . "" .
           68 . . . .    . "" .
           69 . . . .    . "" .
           70 . . . .    . "" .
           71 . . . .    . "" .
           72 . . . . 1000 "" .
           73 . . . .    . "" .
           74 . . . .    . "" .
           75 . . . .    . "" .
           76 . . . .    . "" .
           77 . . . .    . "" .
           78 . . . .    . "" .
           79 . . . .    . "" .
           80 . . . .    . "" .
           81 . . . .    . "" .
           82 . . . .    . "" .
           83 . . . .    . "" .
           84 . . . .    . "" .
           85 . . . .    . "" .
           86 . . . .    . "" .
           87 . . . .    . "" .
           88 . . . .    . "" .
           89 . . . .    . "" .
           90 . . . .    . "" .
           91 . . . .    . "" .
           92 . . . .    . "" .
           93 . . . .    . "" .
           94 . . . .    . "" .
           95 . . . .    . "" .
           96 . . . .    . "" .
           97 . . . .    . "" .
           98 . . . .    . "" .
           99 . . . .    . "" .
          100 . . . .    . "" .
          end
          label values ds006_3__101 MISSINGS
          label values ds006_2__101 MISSINGS
          label values ds006_1__101 MISSINGS

          Comment


          • #6
            OK, I had some of the closing braces in the code to check for type mismatches misplaced. Here's a corrected version which works correctly on your example data.

            Code:
            unab stubs: *w1 *w2 *w3
            forvalues i = 1/3 {
                local stubs: subinstr local stubs "w`i'" "", all
            }
            local stubs: list uniq stubs
            macro list _stubs
            
            foreach s of local stubs {
                local types
                forvalues i = 1/3 {
                    capture local type`i': type `s'w`i'
                    if c(rc) == 0 {
                        if substr(`"`type`i''"', 1, 3) == "str" {
                            local type`i' = "str"
                        }  
                        local types `types' `type`i''
                    }
                    else if c(rc) != 111 { // SKIP IF VARIABLE `s'w`i' DOES NOT EXIST
                        display as error `"Unexpected error processing stub `s'"'
                        exit c(rc)
                    }
                }
                local types: list sort types
                local n_vars: word count `types'
                if `"`:word 1 of `types''"' != `"`:word `n_vars' of `types''"' { // MISMATCH FOUND
                    display `"Stub `s' mismatch: `types'"'
                }
            }
            Last edited by Clyde Schechter; 08 Aug 2023, 15:35. Reason: Corrected a possible situation where string variables of different lengths would be adjudicated as mismatches.

            Comment


            • #7
              This is perfect!!

              Thank you so so very much!!!

              Comment

              Working...
              X