Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • reshaping long-long (?) data (all variables and observations in the single column)

    Dear all, I have data from the IEA that looks like this:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10 country str7 product int year str8 flow str5 gas str13 value
    "USA"    "COAL" 2010 "ELECHEAT" "CO2"   "1857819.3000"
    "USA"    "COAL" 2010 "ELECHEAT" "CO2eq" "1866946.1000"
    "USA"    "COAL" 2011 "ELECHEAT" "CO2"   "1751542.3000"
    "USA"    "COAL" 2011 "ELECHEAT" "CO2eq" "1760135.5000"
    "USA"    "OIL"  2010 "RESIDENT" "CO2"   "63180.8000"  
    "USA"    "OIL"  2010 "RESIDENT" "CO2eq" "63485.4000"  
    "USA"    "OIL"  2010 "ELECHEAT" "CO2"   "36543.3000"  
    "USA"    "OIL"  2010 "ELECHEAT" "CO2eq" "36646.1000"  
    "USA"    "OIL"  2011 "RESIDENT" "CO2"   "50663.4000"  
    "USA"    "OIL"  2011 "RESIDENT" "CO2eq" "50908.5000"  
    "USA"    "OIL"  2011 "ELECHEAT" "CO2"   "29702.7000"  
    "USA"    "OIL"  2011 "ELECHEAT" "CO2eq" "29783.6000"  
    "RUSSIA" "COAL" 2010 "RESIDENT" "CO2"   "6170.2000"   
    "RUSSIA" "COAL" 2010 "RESIDENT" "CO2eq" "6685.4000"   
    "RUSSIA" "COAL" 2010 "ELECHEAT" "CO2"   "312644.4000"
    "RUSSIA" "COAL" 2010 "ELECHEAT" "CO2eq" "313959.6000"
    "RUSSIA" "COAL" 2011 "RESIDENT" "CO2"   "6189.5000"   
    "RUSSIA" "COAL" 2011 "RESIDENT" "CO2eq" "6706.2000"   
    "RUSSIA" "COAL" 2011 "ELECHEAT" "CO2"   "322049.8000"
    "RUSSIA" "COAL" 2011 "ELECHEAT" "CO2eq" "323410.5000"
    "RUSSIA" "OIL"  2010 "RESIDENT" "CO2"   "13141.3000"  
    "RUSSIA" "OIL"  2010 "RESIDENT" "CO2eq" "13188.5000"  
    "RUSSIA" "OIL"  2010 "ELECHEAT" "CO2"   "35918.7000"  
    "RUSSIA" "OIL"  2010 "ELECHEAT" "CO2eq" "36032.8000"  
    "RUSSIA" "OIL"  2011 "RESIDENT" "CO2"   "15983.3000"  
    "RUSSIA" "OIL"  2011 "RESIDENT" "CO2eq" "16038.9000"  
    "RUSSIA" "OIL"  2011 "ELECHEAT" "CO2"   "53819.4000"  
    "RUSSIA" "OIL"  2011 "ELECHEAT" "CO2eq" "53992.1000"  
    end


    These are emissions for two countries (USA, RUSSIA) in two years (2010, 2011), for fuels (product: COAL, OIL), two sectors of the economy (flow: ELECHEAT, RESIDENT), two pollutants (gas: CO2, CO2eq).

    I want to reorganize data like this:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 country int year double(ELECHEAT_COAL_CO2 ELECHEAT_COAL_CO2eq ELECHEAT_OIL_CO2 ELECHEAT_OIL_CO2eq RESIDENT_COAL_CO2 RESIDENT_COAL_CO2eq RESIDENT_OIL_CO2 RESIDENT_OIL_CO2eq)
    "USA"    2010 1857819.3 1866946.1 36543.3 36646.1 63180.8 63485.4 260628.3 261347.5
    "USA"    2011 1751542.3 1760135.5 29702.7 29783.6 50663.4 50908.5 256636.8 257344.9
    "RUSSIA" 2010  312644.4  313959.6 35918.7 36032.8  6170.2  6685.4  13141.3  13188.5
    "RUSSIA" 2011  322049.8  323410.5 53819.4 53992.1  6189.5  6706.2  15983.3  16038.9
    end
    and this:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str7 A str19 B double(C D)
    "country" "flow"                     2010      2011
    "USA"     "ELECHEAT_COAL_CO2"   1857819.3 1751542.3
    "USA"     "ELECHEAT_COAL_CO2eq" 1866946.1 1760135.5
    "USA"     "ELECHEAT_OIL_CO2"      36543.3   29702.7
    "USA"     "ELECHEAT_OIL_CO2eq"    36646.1   29783.6
    "USA"     "RESIDENT_COAL_CO2"     63180.8   50663.4
    "USA"     "RESIDENT_COAL_CO2eq"   63485.4   50908.5
    "USA"     "RESIDENT_OIL_CO2"     260628.3  256636.8
    "USA"     "RESIDENT_OIL_CO2eq"   261347.5  257344.9
    "RUSSIA"  "ELECHEAT_COAL_CO2"    312644.4  322049.8
    "RUSSIA"  "ELECHEAT_COAL_CO2eq"  313959.6  323410.5
    "RUSSIA"  "ELECHEAT_OIL_CO2"      35918.7   53819.4
    "RUSSIA"  "ELECHEAT_OIL_CO2eq"    36032.8   53992.1
    "RUSSIA"  "RESIDENT_COAL_CO2"      6170.2    6189.5
    "RUSSIA"  "RESIDENT_COAL_CO2eq"    6685.4    6706.2
    "RUSSIA"  "RESIDENT_OIL_CO2"      13141.3   15983.3
    "RUSSIA"  "RESIDENT_OIL_CO2eq"    13188.5   16038.9
    end
    I made these two reshaped datasets by hands in Excel for illustration.
    But I have a lot of products, flows, countries, etc. It is impossible to do it for all data I need by hand.
    I can't handle with reshape command for this

    Could you help me, please?

  • #2
    former,
    Code:
    gen group = flow + "_" + product + "_" + gas
    drop flow product gas
    reshape wide value ,i(country year) j(group) s
    latter,
    Code:
    gen group = flow + "_" + product + "_" + gas
    drop flow product gas
    reshape wide value ,i(country group) j(year)
    Last edited by Øyvind Snilsberg; 29 Nov 2021, 07:09.

    Comment


    • #3
      Øyvind, thanks a lot!

      Comment

      Working...
      X