Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How do I reshape my dataset with three identifiers (time, location, and index) when it's already wide?

    Hello Statalists,
    This is an air quality panel data and it looks like this:
    Click image for larger version

Name:	2020-01-19 23.27.19.png
Views:	1
Size:	277.8 KB
ID:	1532584

    The variable names in Chinese indicate different locations. However, I need type as variables on top, and locations listed as a column. I read the examples of reshape command and did not find a similar case. How should I do this? Thanks!
    Best,
    Sony Tian

  • #2
    This requires two -reshape- operations. First you have to -reshape long- to get the location as a variable, and then follow that with a -reshape wide- to get the type to be a series of variables.

    But why do you want to do that? Why not just stop after the first -reshape long- with a fully long data set where both type and location are both variables. That is the layout that is most useful for most analysis in Stata. Do you have some specific tasks to do that require the types to be separate variables? If not, you're just making more work for yourself.

    Posting example data as a screenshot is not very helpful. It doesn't provide metadata, and there is no way to import the data from a screenshot into Stata to develop code. In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    If you need more concrete help with these -reshape- steps, post back with example data given by -dataex-.

    Comment


    • #3
      Hello Clyde,
      Thank your for the reply. I was thinking about merging this dataset with another, but -merge- will do its job. You're right, I just need to do the first -reshape long-. This is the output of -dataex- (this is the first I post, not sure if it works):
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long date byte hour str9 type float(东四 天坛 官园 万寿西宫 奥体中心 农展馆 万柳 北部新区 植物园 丰台花园 云岗 古城 房山 大兴 亦庄 通州 顺义 昌平) double 门头沟
      20131205 0 "PM2.5"       .   .   .   .   .   .   .   .  .   .   .   .   .   .   .   .   .   .   .
      20131205 0 "PM2.5_24h"   .   .   .   .   .   .   .   .  .   .   .   .   .   .   .   .   .   .   .
      20131205 0 "PM10"        .   .   .   .   .   .   .   .  .   .   .   .   .   .   .   .   .   .   .
      20131205 0 "PM10_24h"    .   .   .   .   .   .   .   .  .   .   .   .   .   .   .   .   .   .   .
      20131205 0 "AQI"       126 145 134 167 113 123 155 147 72 219 143 145 216 284 229 242  73  74 102
      20131205 1 "PM2.5"      93  93  63  79  59 108  85 169 49  72  90 109  80 128 205 190  70  80  59
      20131205 1 "PM2.5_24h"  93 108  99 123  82  92 115 113 53 163 109 113 164 231 177 190  50  56  77
      20131205 1 "PM10"      103 124  81 107  66 148   . 153 54  79 105 118   . 139 221 190  78  85  70
      20131205 1 "PM10_24h"   97 130 122 141 105 115   . 104 58 207 121 136   . 227 182 198  56  64 112
      20131205 1 "AQI"       123 141 130 161 109 121 150 147 73 213 143 147 214 281 227 240  69  76 103
      20131205 2 "PM2.5"     165 148 156 149  80 164 111 170 45 147  77  80 121 201 166 145  90  94  59
      20131205 2 "PM2.5_24h"  93 108 100 121  79  92 115 112 54 160 109 115 161 230 173 187  49  58  78
      20131205 2 "PM10"      197 183 191 236 112 232   . 179 52 178  98  93   . 242 173 150 102 103  84
      20131205 2 "PM10_24h"   98 130 124 142 102 117   . 105 60 203 121 138   . 228 180 194  55  66 112
      20131205 2 "AQI"       123 141 131 159 105 122 150 147 74 210 142 149 211 280 223 237  68  78 104
      20131205 3 "PM2.5"     133 118 138 119 121 141 102 147 79 151 105  94 169 134 119 111  44  74  64
      20131205 3 "PM2.5_24h"  92 106 101 118  78  92 116 112 56 159 110 117 160 225 166 181  49  59  79
      20131205 3 "PM10"      142 146 163 147 163 190   . 115 96 165 129  94   . 134 129 120  49  66  86
      20131205 3 "PM10_24h"   98 130 126 139 101 118   . 106 63 200 123 138   . 224 176 189  55  66 114
      20131205 3 "AQI"       121 139 133 154 103 121 151 146 76 209 143 152 210 275 216 231  67  80 105
      20131205 4 "PM2.5"      85  88 115 101  82 106 100 130 89 147  96  84 139 109 121  96  17  73  64
      20131205 4 "PM2.5_24h"  90 103 101 115  75  90 115 112 59 160 110 118 158 220 160 172  47  60  80
      20131205 4 "PM10"       78 106 122  93 111 124   . 105 83 157 112  98   . 106 133 100  15  67  85
      20131205 4 "PM10_24h"   95 127 125 134  99 116   . 106 65 194 125 138   . 220 172 180  54  67 115
      20131205 4 "AQI"       118 135 133 150 100 118 150 146 80 210 144 154 208 270 210 222  65  81 107
      20131205 5 "PM2.5"      29  27  45  46  42  30  98  81 56  76  86  70  97  71  58  40  12   7  61
      20131205 5 "PM2.5_24h"  85  98  99 109  72  85 116 110 60 157 112 118 157 214 151 162  46  58  82
      20131205 5 "PM10"       31  30  45  26  47  22   .  40 43 109  76  61   .  67  61  46   6   6  79
      20131205 5 "PM10_24h"   91 122 122 128  95 110   . 104 65 189 127 137   . 215 166 170  53  65 117
      20131205 5 "AQI"       112 129 130 143  97 113 151 143 81 207 146 154 207 264 201 212  64  79 108
      20131205 6 "PM2.5"      24  24  46  20  30  21  64  30 27  52  71  49  51  32  22  13   5   6  50
      20131205 6 "PM2.5_24h"  80  93  96 105  69  82 115 106 59 155 114 117 155 207 144 148  45  56  82
      20131205 6 "PM10"       20  31  59  17  29  18   .  13 33  50  84  44   .  29  22  32  10   8  63
      20131205 6 "PM10_24h"   86 117 119 123  91 106   . 102 65 181 129 135   . 208 161 159  52  62 117
      20131205 6 "AQI"       106 123 127 137  93 108 150 139 80 205 148 153 205 257 191 198  63  76 109
      20131205 7 "PM2.5"      17  24  23  24  18  17  29  19  8  36  44  19  59  34  17  16   3   6  30
      20131205 7 "PM2.5_24h"  76  89  93 102  66  79 112 103 58 151 114 116 153 201 137 139  44  54  82
      20131205 7 "PM10"       16  30  41   9  18  27   .   8 26  31  59  19   .  31  21  26  10   7  42
      20131205 7 "PM10_24h"   82 113 115 120  87 103   .  99 65 172 130 134   . 202 155 151  51  60 117
      20131205 7 "AQI"       101 118 123 133  89 104 146 135 79 201 148 151 203 251 181 184  62  74 109
      20131205 8 "PM2.5"      13  13  13  11  11  26  17  20 21  16  40  16  56  15  17  13   7  11  30
      20131205 8 "PM2.5_24h"  73  86  90  98  64  76 108 100 57 146 113 114 149 193 132 132  44  53  82
      20131205 8 "PM10"       22  31  34   6  18  42   .  16 23  33  65  30   .  25  27  33   7  12  53
      20131205 8 "PM10_24h"   79 109 111 114  84  99   .  97 64 163 131 131   . 194 151 144  50  58 116
      20131205 8 "AQI"        98 113 119 128  86 101 141 131 78 194 148 149 199 243 174 175  61  72 109
      20131205 9 "PM2.5"      18  14  10  10   6  23  16  32 21   9  70  20  27  15  18  15   7   9  30
      20131205 9 "PM2.5_24h"  71  81  87  95  61  74 103  98 55 140 114 111 145 184 126 128  42  51  82
      20131205 9 "PM10"       25  19  63   6   9  27   .  27 25  37 105  34   .  39  32  37  10  10  60
      20131205 9 "PM10_24h"   77 103 109 111  80  97   .  94 62 154 133 127   . 185 146 140  48  56 116
      20131205 9 "AQI"        95 108 115 124  83  99 135 128 75 186 149 145 193 234 166 169  59  70 109
      end

      Comment


      • #4
        Good. So the main issue now is that the various locations need to be renamed so that they have some common start, and from there -reshape- applies directly. I don't know what the numbers in your data represent, so I have just called them var, but if they are, for example, concentrations you might choose conc or something more mnemonic than just var.

        Code:
        isid date hour type
        ds date hour type, not
        rename (`r(varlist)') var=
        reshape long var, i(date hour type) j(location) string
        Note: this code assumes that date hour and type uniquely identify observations in the data (and it will break at the first line if this is not the case). It also assumes that all of the variables in the data set other than that are names of locations. If these assumptions are incorrect, please post back with a new data example illustrating these features that I did not anticipate.

        Comment


        • #5
          The data set meets the assumptions and your code perfectly solved my problem! Thank you Clyde!

          Comment

          Working...
          X