Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshaping data

    Hello,

    I am trying to change the orientation of my data, but do not know an efficient way of going about it. I have data for the prices for a basket of different goods across 10 years and multiple states. The data is currently formatted as pictured below. I want to change it so there is a month variable and each of the price series is its own variable.

    In other words, I would like my data to look like this

    Year Month Price_Series1 Price_Series2 ........

    What is the best way to go about this?


    Thank you!

    Christy
    Attached Files

  • #2
    Welcome to Statalist.

    The following code will reshape your data from its current wide layout to a long layout, where each observation will have price_series, Year, Month, and value.
    Code:
    rename (January-December) (value#), addnumber
    reshape long value, i(price_series year) j(Month)
    Note that in the absence of data (more about that later) this code is untested, but I'm optimistic I have it correct.

    To get what you describe in post #1, you would then have to reshape the price_series into a wide layout. The first problem you would encounter is that the values of price_series are not suitable for use as Stata variable names. The second problem is that it appears you will wind up with a huge number of variables.

    But there is a more fundamental issue. The experienced users here generally agree that, with few exceptions, Stata makes it much more straightforward to accomplish complex analyses using a long layout of your data rather than a wide layout of the same data. You should try to achieve what you need with the data organized in the long layout I describe, and seek the help of Statalist in doing so if you encounter problems.

    For example, I notice your price series values consist of three pieces: a product name such as "7-Up 35cl", what appears to be a geographic name, and what seems to be a geographic descriptor, with the underscore character separating the pieces. With the data in a long layout, you will be easily able to separate this into three variables, which it seems to me will be more helpful for your analyses.
    Code:
    . split price_series, parse(_) generate(psd)
    variables created as string:
    psd1  psd2  psd3
    
    . rename (psd1 psd2 psd3) (product place place_type)
    
    . list, abbreviate(12) noobs clean
    
                 price_series     product   place   place_type  
         7-Up 35cl_ABIA_URBAN   7-Up 35cl    ABIA        URBAN  
        7-Up 35cl_ABUJA_URBAN   7-Up 35cl   ABUJA        URBAN
    Finally, let me close with some brief advice on making effective use of Statalist.

    Please take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using code delimiters [CODE] and [/CODE], and to use the dataex command to provide sample data, as described in section 12 of the FAQ. A picture of your data is of little use for demonstrating code, and retying data from a picture as I had to is not an exciting prospect.

    The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

    Comment


    • #3
      Thank you for your help, as a new user, it is very much appreciated. My apologies as I get accustomed to the forum. Your reshape and split codes both ran properly.

      I agree that using the price_series as wide would generate too many variables, but I believe I need to use the reshape wide command in order to generate a variable for each of the products.

      I know I need to use a command along the lines of

      reshape wide itemlabel, i(Year month State sector) j(??)

      I have included a sample of the data

      input str54 itemlabel int Year str12 State str6 sector byte Month str17 value
      "Andrew's Liver Salt" 2001 "ABIA" "URBAN" 2 "10"
      "Biro - Bic" 2001 "ABIA" "URBAN" 2 "10"
      "Biro - Bic" 2001 "ABIA" "URBAN" 10 "10"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 10 "10"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 4 "10"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 2 "10"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 5 "10.05"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 12 "10.09166666666667"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 6 "10.10025"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 7 "10.15075125"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 8 "10.20150500625"
      "Andrew's Liver Salt" 2001 "ABIA" "URBAN" 3 "10.25"
      "Biro - Bic" 2001 "ABIA" "URBAN" 3 "10.25"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 3 "10.25"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 9 "10.25251253128125"
      "Biro - Bic" 2001 "ABIA" "URBAN" 11 "10.4125"
      "Exercise book - 2A" 2001 "ABIA" "URBAN" 11 "10.4125"
      "Egg - agric 1" 2001 "ABIA" "URBAN" 1 "10.53"
      "Egg - agric 1" 2001 "ABIA" "URBAN" 2 "10.58265"
      "Biro - Bic" 2001 "ABIA" "URBAN" 12 "10.75833333333333"
      "Egg - agric 1" 2001 "ABIA" "URBAN" 3 "10.978"
      "Coffee - nescafe" 2001 "ABIA" "URBAN" 2 "100"
      "Coffee - nescafe" 2001 "ABIA" "URBAN" 10 "100"
      "Combantrin - packet of 6 tablets" 2001 "ABIA" "URBAN" 9 "100"
      "Kerosene - 1 gallon (4.5litres)" 2001 "ABIA" "URBAN" 11 "100"
      "Kerosene - 1 gallon (4.5litres)" 2001 "ABIA" "URBAN" 12 "100"
      "Kerosene - 1 gallon (4.5litres)" 2001 "ABIA" "URBAN" 10 "100"
      "Kerosene - 1 gallon (4.5litres)" 2001 "ABIA" "URBAN" 7 "100"
      "Kerosene - 1 gallon (4.5litres)" 2001 "ABIA" "URBAN" 9 "100"
      "Margarine - blue band" 2001 "ABIA" "URBAN" 10 "100"
      "Urine Test - private laboratory" 2001 "ABIA" "URBAN" 4 "100"
      "Vegetable oil" 2001 "ABIA" "URBAN" 9 "100"
      "Weekly Magazine Newswatch" 2001 "ABIA" "URBAN" 2 "100"
      "Weekly Magazine Newswatch" 2001 "ABIA" "URBAN" 1 "100"
      "Weekly Magazine Newswatch" 2001 "ABIA" "URBAN" 4 "100"
      "Vegetable oil" 2001 "ABIA" "URBAN" 10 "100.1158333333333"
      "Kerosene - 1 gallon (4.5litres)" 2001 "ABIA" "URBAN" 8 "100.5"
      "Urine Test - private laboratory" 2001 "ABIA" "URBAN" 5 "100.5"
      "Weekly Magazine Newswatch" 2001 "ABIA" "URBAN" 5 "100.5"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 7 "1000"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 2 "1000"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 8 "1005"
      "Urine Test - private laboratory" 2001 "ABIA" "URBAN" 6 "101.0025"
      "Weekly Magazine Newswatch" 2001 "ABIA" "URBAN" 6 "101.0025"
      "Urine Test - private laboratory" 2001 "ABIA" "URBAN" 7 "101.5075125"
      "Weekly Magazine Newswatch" 2001 "ABIA" "URBAN" 7 "101.5075125"
      "Vegetable oil" 2001 "ABIA" "URBAN" 8 "101.7833333333334"
      "Vegetable oil" 2001 "ABIA" "URBAN" 7 "101.9444"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 9 "1010.025"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 10 "1015.075125"
      "Urine Test - private laboratory" 2001 "ABIA" "URBAN" 8 "102.0150500625"
      "Weekly Magazine Newswatch" 2001 "ABIA" "URBAN" 8 "102.0150500625"
      "Coffee - nescafe" 2001 "ABIA" "URBAN" 3 "102.5"
      "Weekly Magazine Newswatch" 2001 "ABIA" "URBAN" 3 "102.5"
      "Urine Test - private laboratory" 2001 "ABIA" "URBAN" 9 "102.5251253128125"
      "Weekly Magazine Newswatch" 2001 "ABIA" "URBAN" 9 "102.5251253128125"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 11 "1020.150500625"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 3 "1025"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 12 "1025.251253128125"
      "Vegetable oil" 2001 "ABIA" "URBAN" 11 "103.1928947368421"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 4 "1030.125"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 5 "1035.275625"
      "Women's Shoe - Bata" 2001 "ABIA" "URBAN" 7 "1037.5"
      "Coffee - nescafe" 2001 "ABIA" "URBAN" 11 "104.125"
      "Margarine - blue band" 2001 "ABIA" "URBAN" 11 "104.125"
      "Maize - grain (yellow)" 2001 "ABIA" "URBAN" 4 "104.5454545454546"
      "Rent - Single Room (modern house)" 2001 "ABIA" "URBAN" 6 "1040.452003125"
      "Women's Shoe - Bata" 2001 "ABIA" "URBAN" 8 "1042.6875"
      "Women's Shoe - Bata" 2001 "ABIA" "URBAN" 9 "1047.9009375"
      "Poplin,65% poly & 35% cotton (1yd)" 2001 "ABIA" "URBAN" 4 "105"
      "Maize - grain (yellow)" 2001 "ABIA" "URBAN" 5 "105.0681818181818"
      "Poplin,65% poly & 35% cotton (1yd)" 2001 "ABIA" "URBAN" 5 "105.525"
      "Maize - grain (yellow)" 2001 "ABIA" "URBAN" 6 "105.5935227272727"
      "Poplin,65% poly & 35% cotton (1yd)" 2001 "ABIA" "URBAN" 6 "106.052625"
      "Women's Shoe - Bata" 2001 "ABIA" "URBAN" 4 "1075.25"
      "Women's Shoe - Bata" 2001 "ABIA" "URBAN" 5 "1080.62625"
      "Women's Shoe - Bata" 2001 "ABIA" "URBAN" 6 "1086.02938125"
      "Singlet - acrylic (men)" 2001 "ABIA" "URBAN" 1 "109.45"
      "Egg - agric 1" 2001 "ABIA" "URBAN" 4 "11"
      "Egg - agric 1" 2001 "ABIA" "URBAN" 5 "11.055"
      "Egg - agric 1" 2001 "ABIA" "URBAN" 6 "11.110275"
      "Egg - agric 1" 2001 "ABIA" "URBAN" 8 "11.46428571428571"
      "Biro - Bic" 2001 "ABIA" "URBAN" 4 "11.66666666666667"
      "Biro - Bic" 2001 "ABIA" "URBAN" 5 "11.725"
      "Biro - Bic" 2001 "ABIA" "URBAN" 6 "11.783625"
      "Biro - Bic" 2001 "ABIA" "URBAN" 7 "11.842543125"
      "Egg - agric 1" 2001 "ABIA" "URBAN" 7 "11.86956521739131"
      "Biro - Bic" 2001 "ABIA" "URBAN" 8 "11.901755840625"
      "Egg - agric 1" 2001 "ABIA" "URBAN" 10 "11.91666666666667"
      "Biro - Bic" 2001 "ABIA" "URBAN" 9 "11.96126461982812"
      "Fansidar - sachet of 3 tablets" 2001 "ABIA" "URBAN" 9 "110"
      "Fansidar - sachet of 3 tablets" 2001 "ABIA" "URBAN" 10 "110"
      "Maize - grain (white)" 2001 "ABIA" "URBAN" 4 "110"
      "Singlet - acrylic (men)" 2001 "ABIA" "URBAN" 2 "110"
      "Maize - grain (white)" 2001 "ABIA" "URBAN" 5 "110.55"
      "Maize - grain (white)" 2001 "ABIA" "URBAN" 6 "111.10275"
      "Vegetable oil" 2001 "ABIA" "URBAN" 12 "111.8825"
      "Singlet - acrylic (men)" 2001 "ABIA" "URBAN" 3 "112.75"
      "Fansidar - sachet of 3 tablets" 2001 "ABIA" "URBAN" 11 "113.63"
      "Fansidar - sachet of 3 tablets" 2001 "ABIA" "URBAN" 12 "113.7214285714286"
      end
      [/CODE]

      Thank you,

      Christy

      Comment


      • #4
        Thank you for the sample data.

        This code will I think do what you ask for - I tested it on your data.
        Code:
        encode itemlabel, generate(series)
        label list series
        drop itemlabel
        reshape wide value, i(Year Month State sector) j(series)
        The series are named value1 value2 ... because I see no reliable way of translating your itemlabel values into Stata variable names. I see that itemlabel can be as long as 54 characters, and Stata variable names are limited to 32 characters. I took the easy way out and encoded them into numbers.

        But with that said, you explained

        I believe I need to use the reshape wide command in order to generate a variable for each of the products.
        To be clear about my advice in post #2, I believe it is unlikely that you in fact need a variable for each of the products. I believe you can accomplish what you need to do starting with the data in a long layout.

        Comment

        Working...
        X