Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • stack data?

    Dear All, I have this data set.
    Code:
    clear
    input int A str12 B str9(C D) str18 E str9 F str15 G str9(H I J K) str11 L str9 M str17 N str9(O P)
       . "Keelung City" ""          ""          ""                   ""          "New Taipei City" ""          ""          ""          ""          "Taipei City" ""          ""                  ""          ""         
       . "variable1"    "variable2" "variable3" "variable4"          "variable5" "variable1"       "variable2" "variable3" "variable4" "variable5" "variable1"   "variable2" "variable3"         "variable4" "variable5"
    1991 "439"          "53.25"     "-332.5"    "-718.25"            "-1104"     "704"             "49.31"     "-605.38"   "-1260.07"  "-1914.76"  "28"          "47.11"     "66.22"             "85.33"     "104.44"   
    1992 "509"          "53.81"     "-401.38"   "-856.57" "-1311.76"  "789"             "48.29"     "-692.42"   "-1433.13"  "-2173.84"  "26"          "49.5"      "73"                "96.5"      "120"      
    1993 "469"          "54.5"      "-360"      "-774.5"             "-1189"     "899"             "47.79"     "-803.42"   "-1654.63"  "-2505.84"  "25"          "45.28"     "65.56"             "85.84"     "106.12"   
    1994 "445"          "55.47"     "-334.06"   "-723.59"            "-1113.12"  "914"             "47.85"     "-818.3"    "-1684.45"  "-2550.6"   "27"          "48.41"     "69.81" "91.23"     "112.64"   
    end
    I'd like to stack the data into
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str15 id int(year variable1) double(variable2 variable3 variable4 variable5)
    "Keelung City"    1991 439 53.25  -332.5  -718.25    -1104
    "Keelung City"    1992 509 53.81 -401.38  -856.57 -1311.76
    "Keelung City"    1993 469  54.5    -360   -774.5    -1189
    "Keelung City"    1994 445 55.47 -334.06  -723.59 -1113.12
    "New Taipei City" 1991 704 49.31 -605.38 -1260.07 -1914.76
    "New Taipei City" 1992 789 48.29 -692.42 -1433.13 -2173.84
    "New Taipei City" 1993 899 47.79 -803.42 -1654.63 -2505.84
    "New Taipei City" 1994 914 47.85  -818.3 -1684.45  -2550.6
    "Taipei City"     1991  28 47.11   66.22    85.33   104.44
    "Taipei City"     1992  26  49.5      73     96.5      120
    "Taipei City"     1993  25 45.28   65.56    85.84   106.12
    "Taipei City"     1994  27 48.41   69.81    91.23   112.64
    end
    Any suggestions? Thanks.
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    Why not
    Code:
    help reshape long
    ?
    Last edited by Daniel Schaefer; 08 Jul 2022, 17:47. Reason: Im guessing someone with 1776 posts on this forum actually tried reshape first.

    Comment


    • #3
      Dear Daniel, I am aware of -reshape- command, but am not sure whether it can be directly applied to my purpose. You might have a better idea.

      Ho-Chuan (River) Huang
      Stata 19.0, MP(4)

      Comment


      • #4
        Why can't it be applied? I'm not at my computer so I can't test this, but why wouldn't that be the solution, or a solution?

        Comment


        • #5
          Jared Greathouse I think it may be because of the first two lines of the input data in #1. I think the key to the problem is that we need to rename the variables such that "`cityname'_`varname'" is the variable name. Just consider that cityname is captured by observation 1 while varname is captured by observation 2. So what we need is a for loop over the set of variables, where each iteration of the loop changes the current variable name to "`cityname'_`varname'". Then we drop the first two observations to get a dataset that OP can use -reshape- on.

          I've got some code that doesn't work. I bet I can get it working eventually, but I'm probably not going to finish it tonight. I'd be perfectly happy if you figured this out first.
          Last edited by Daniel Schaefer; 08 Jul 2022, 20:35.

          Comment


          • #6
            River, please find the code below.

            Code:
            local v_l = "B"
            foreach v of varlist B-P {
                replace `v' = `v_l' in 1 if mi(`v'[1])
                local v_l = "`v'"
            }
            
            foreach v of varlist B-P {
                local newname = `v'[2] + "_" + strtoname(`v'[1])
                rename `v' `newname'
            }
            
            drop in 1/2
            
            rename A year
            reshape long variable1 variable2 variable3 variable4 variable5, i(year) j(id) string
            
            replace id = strtrim(subinstr(id, "_", " ", .))
            
            destring variable1-variable5, replace
            sort id year
            order id year

            Comment


            • #7
              Dear Professor Wang, Thanks a lot. It works well.
              Ho-Chuan (River) Huang
              Stata 19.0, MP(4)

              Comment


              • #8
                A slightly different answer to get you to the same dataset

                Code:
                rename A year
                gen `c(obs_t)' row = _n
                
                local stems
                local stemnum 0
                foreach v of varlist B-P {
                  local vnum = real(ustrregexrf(`"`=`v'[2]'"', "variable", "", .))
                  if !mi(`vnum') {
                    if `vnum'==1 {
                      local ++stemnum
                      local stem : word `stemnum' of `c(ALPHA)'
                      local stems `stems' `stem'
                    }
                    rename `v' `stem'`vnum'
                  }
                }
                local ncities : word count `stems'
                
                reshape long `stems' , i(row) j(var)
                
                rename (`stems') (city#), addnumber
                reshape long city, i(row var) j(cityid)
                
                reshape wide city, i(row cityid) j(var)
                rename (city#) (variable#)
                
                tempname cities
                preserve
                keep cityid variable1
                keep in 1/`ncities'
                rename variable1 cityname
                save `cities', replace
                restore
                
                merge m:1 cityid using `cities' , nogen
                drop if inrange(row, 1, 2)
                drop row cityid
                sort cityname year
                order cityname year, first

                Comment


                • #9
                  Dear Leonardo, Many thanks for this helpful suggestion.
                  Ho-Chuan (River) Huang
                  Stata 19.0, MP(4)

                  Comment


                  • #10
                    Code:
                    stack (A B C D E F) (A G H I J K) (A L M N O P) if A != ., into(Y V1 V2 V3 V4 V5) clear
                    
                    recode _stack (1=1 "Keelung") (2 = 2 "New Taipei") (3 = 3 "Taipei"), gen(City)

                    Comment


                    • #11
                      Dear Romalpa, Thanks a lot for this very concise solution.
                      Ho-Chuan (River) Huang
                      Stata 19.0, MP(4)

                      Comment

                      Working...
                      X