Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Keep variable labels after reshape long

    Hello Statlist-Users,

    I know that there has been written a lot about that topic but non of the solutions worked out for me.

    The question is simple: I want to keep variable labels of a large dataset after reshaping it to long format.

    Problem: my dataset is quite big with more than 3,500 variables in wideformat (most variables go from 1995 - 2011) turning to about 600 variables after the reshape.
    All variables have the form var1_1995, var1_1996, ... , var1_2011; var2_1995,...,var2_2011; and so on.
    Thus, in order to simplify my reshape, I do the following:

    Code:
    unab mylist: *_2011
    
    foreach v of local mylist {
    local stubs `"`stubs' `=substr("`v'",1,length("`v'")-4)'"'
    }
    
    reshape long "`stubs'", i(EF1) j(Jahr)
    The unab command therefore creates a macro extracting the basic form of each variable --> var1_ var2_ var3_ etc.

    Is there a way to also transfer variable labels from wide to long when proceeding like just explained?

    Thanks a lot in advance, I hope somebody can help me out.
    Philipp

  • #2
    Your code could be simplified:

    Code:
    unab stubs: *_2011
    local stubs: subinstr local stubs "2011" "", all
    reshape long `stubs', i(EF1) j(Jahr)
    but what you don't make clear is where the variable labels are supposed to come from!

    1. There are more input variables than output variables here.

    2. If the variable labels are informative about the original variables, they may well need editing first.

    Comment


    • #3
      Thanks for the advice regarding my code, I will try that out.

      Variable labels are defined in the wide format.

      For example: var1 always has the same description over the years, like: var1_1995 has the Label "1995: description var1"
      var1_1996 has the Label "1996: description var1" and so on ... same goes for many of the other variables.
      Some Variables also have the same variable Label over the whole time period without a year being specified in the description.
      So i thougth that there is the possibility, especially for the latter mentioned variables, to extract the variable Label from one year and apply it to the new variable.

      Comment


      • #4
        OK, so try this:

        Code:
        unab stubs: *_2011
        local stubs: subinstr local stubs "2011" "", all
        
        foreach s of local stubs { 
            local vl`s' : variable label `s'2011 
            local vl`s' : subinstr local vl`s' "2011:" "" 
            local vl`s' = trim("`vl`s''") 
        } 
        
        reshape long `stubs', i(EF1) j(Jahr)
        
        foreach s of local stubs { 
            label var `s' "`vl`s''"
        }
        I suspect that this approach is one of those documented, which you mysteriously claim did not work out for you.

        Comment


        • #5
          One way I often do this will be to keep a file with the variable names and labels alone, and when the reshape is done, I will append the reshaped file to the file with the stubs and labels.
          In this case, this is what you can do: Assuming current file is saved as original.dta

          Code:
          use original.dta unab stubs: *_2011 local stubs: subinstr local stubs "2011" "", all reshape long `stubs', i(EF1) j(Jahr) save temp1.dta, replace clear use original.dta keep EF1 Jahr *_2011 drop if EF1!=. //Hopefully this drop all cases append using temp1.dta Kinda of a trick, but it works!

          Comment


          • #6
            Thank you very much Nick. That one works fine.
            In order to clear out the mistery, here is the code I tried, which did not work:

            Code:
                
            unab mylist: *_2011
            
            foreach v of local mylist {
            local stubs `"`stubs' `=substr("`v'",1,length("`v'")-4)'"'
            }
            
            foreach s of local mylist {
             local `s'label = subinstr("`: variable label `s'_2011'", " (2011)", "", 1)
            }
            
            reshape long "`stubs'", i(EF1) j(Jahr)
            
            foreach s of local stubs {
             label var `s' "``s'label'"
             }
            I'd be thankful for an advice what my mistake was here.
            Last edited by Philipp Schrauth; 08 Mar 2016, 06:40.

            Comment


            • #7
              @Saint Fko: Do you mind putting your code into a codebox? Just to have that solution documented in a clear way as well...

              Comment


              • #8
                My bad Philipp... This is the code in a codebox.

                Code:
                use original.dta
                unab stubs: *_2011
                local stubs: subinstr local stubs "2011" "", all
                reshape long `stubs', i(EF1) j(Jahr)
                save temp1.dta, replace
                clear
                
                use original.dta
                keep EF1 Jahr *_2011
                cap drop if EF1!=.
                cap drop if EF1 !="" //Hopefully this or the preceding line drops all cases depending on whether EF1 is numeric or string
                append using temp1.dta
                ​Kinda of a trick, but it works!
                Last edited by Saint Fko; 08 Mar 2016, 07:13.

                Comment

                Working...
                X