Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Transpose row values into columns

    Dear Stata Experts,

    i have a dataset that looks like this:

    Click image for larger version

Name:	Unbenannt3.PNG
Views:	1
Size:	12.6 KB
ID:	1695883


    What i am trying to achieve is the following structure:
    Size State of construction Series type Product name Product family Pumping speed max Hz Units PumpModel (column headers)
    3 B RH RH0003B ZEBRA 2.4 50 m3/h RH0003B
    3 B RH RH0003B ZEBRA 2.9 60 m3/h RH0003B
    10 B RH RH0010B ZEBRA 8.1 50 m3/h RH0010BA
    10 B RH RH0010B ZEBRA 9.7 60 m3/h RH0010BA
    10 B RH RH0010B ZEBRA 8.1 50 m3/h RH0010BB
    10 B RH RH0010B ZEBRA 9.7 60 m3/h RH0010BB
    Basically i want to make row names of variable Name to columns, while preserving everything else + adding a new column of existing columns of ZEBRA.

    I have tried to use reshape wide to long function, but it seems to produce some awkward results.

    Any ideas?

    Sincerely,

    Pavlo


  • #2
    To let users here to help more easily, please refer to the FAQ (https://www.statalist.org/forums/help) part 12 on how to post sample data using dataex. Screenshot is next to useless.

    I guess this should get the same sample data set, run this and post the results that appear in the Stata screen, from CODE to /CODE:
    Code:
    dataex Name Hz Units ZEBRARH0003B ZEBRARH0010BA ZEBRAH0010BB in 1/7

    Comment


    • #3
      Thanks for the tip Ken, i always struggled with that

      Here is the original dataset guys:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str54 Name str2 Hz str10 Units str49(ZEBRARH0003B ZEBRARH0010BA ZEBRARH0010BB)
      "Size"                  ""   ""      "3"         "10"        "10"      
      "State of construction" ""   ""      "B"         "B"         "B"        
      "Series type"           ""   ""      "RH"        "RH"        "RH"      
      "Product name"          ""   ""      "RH 0003 B" "RH 0010 B" "RH 0010 B"
      "Product family"        ""   ""      "ZEBRA"     "ZEBRA"     "ZEBRA"    
      "Pumping speed max"     "50" "m³/h" "2.4"       "8.1"       "8.1"      
      "Pumping speed max"     "60" "m³/h" "2.9"       "9.7"       "9.7"      
      end
      Please note that ZEBRA columns are 3 only for this example, there are over 50 in total

      Comment


      • #4
        There is long data mixed together with wide data... so I'd probably process them separately. This may help you think up an approach:

        Code:
        gen seq = _n
        reshape long ZEBRA, i(seq) j(test, string)
        
        preserve
        keep if Name == "Pumping speed max"
        keep test Hz Units ZEBRA
        rename ZEBRA pump_speed_max
        save f01, replace
        restore
        
        keep if Name != "Pumping speed max"
        drop Hz Units
        encode Name, gen(name_n)
        reshape wide ZEBRA, i(seq test) j(name_n)
        collapse (firstnm) ZEBRA*, by(test)
        rename ZEBRA1 product_family
        rename ZEBRA2 product_name
        rename ZEBRA3 series_type
        rename ZEBRA4 size
        rename ZEBRA5 state_construction
        
        merge 1:m test using f01, nogen
        rename test pump_model

        Comment


        • #5
          You are a genius Ken, works as it supposed to. Enorme thanks to you for your time.

          Comment

          Working...
          X