
  • Reshaping Data and Variables for a Dynamic Panel Data Analysis

    Hi there,
    I am new to this forum and am currently working on a panel data analysis for a university course.
    Unfortunately, I am not that familiar with Stata and hope that you can help me rearrange my variables.
    My Data Editor looks like this:
    [Image: Stata-Output.png]

    Instead of Series_Name as a single variable, I want separate variables for Population, Energy, GDP and Gross Capital Formation, so that the data look like this:

    [Image: Excel-Image (2).png]

    How can I reshape my data? I would be very thankful if you could help me.

    Kind regards,

    Tom

  • #2
    Your series name variable cannot be used in its current form as the source for variable names because it contains characters (.) that are not permitted in variable names. Only letters, digits, and the underscore character are allowed. The column headers you show in columns D and F of the Excel screenshot are inadmissible as Stata variable names. So SeriesCode must first be transformed into strings that can appear as parts of a variable name. The code will probably look like this:
    Code:
    replace SeriesCode = strtoname(SeriesCode)
    reshape wide Value, i(CountryCode Year) j(SeriesCode) string
    rename Value* *
    The above code may require modifications to work in your actual data, based on things that cannot be discerned from the screenshots shown. Variable names must also be at most 32 characters in length. If any of the SeriesCode values exceed 27 characters, then attempting to -reshape- starting from a variable named Value will breach that limit. As a first cut, you could rename Value to just v before the -reshape- and replace references to Value in the code with v. If you are still confronted with excessive length, then you will have to use some -replace- commands to shorten some of the values of SeriesCode. Also, the -rename- command can fail if the first character in some value(s) of SeriesCode, after -strtoname()- has been applied, is something other than _ or a letter. This, too, would require changing that (those) value(s) of SeriesCode to something suitable.
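
    The length check and the shorter stub described above could be sketched as follows. This is only an illustration under stated assumptions: the toy data and the long series code are made up, and the variable names (CountryCode, Year, SeriesCode, Value) are those assumed in the code above, not necessarily the ones in the real dataset.

```stata
* Toy data with made-up values; the long series code is hypothetical
clear
input str3 CountryCode int Year str29 SeriesCode double Value
"AAA" 2000 "SP.POP.TOTL"                   100
"AAA" 2000 "NE.GDI.TOTL.ZS.HYPOTHETICAL.X"  20
"AAA" 2001 "SP.POP.TOTL"                   110
"AAA" 2001 "NE.GDI.TOTL.ZS.HYPOTHETICAL.X"  21
end

* Turn the codes into legal name fragments (the periods become underscores)
replace SeriesCode = strtoname(SeriesCode)

* Length of the longest fragment that -reshape- will append to the stub
generate int namelen = strlen(SeriesCode)
summarize namelen, meanonly
local longest = r(max)
display "longest SeriesCode value: `longest' characters"
drop namelen

* "Value" (5 characters) leaves room for 27; a one-character stub leaves room for 31
rename Value v

* Values that are still too long would need shortening by hand, for example:
* replace SeriesCode = "GCF_share" if SeriesCode == "NE_GDI_TOTL_ZS_HYPOTHETICAL_X"

reshape wide v, i(CountryCode Year) j(SeriesCode) string
rename v* *
list
```

    Here the longest fragment has 29 characters, so the stub Value would produce a 34-character name, while the stub v stays within the limit.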

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
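
    As a rough illustration of how short a -dataex- example can be kept, the sketch below uses the auto dataset that ships with Stata (not the data from this thread) and restricts both the variables and the observations shown.

```stata
* Sketch with a dataset shipped with Stata; substitute your own variables
sysuse auto, clear

* Show only three variables and the first five observations
dataex make price mpg in 1/5
```

    The output can be copied from the Results window and pasted into a post as it stands.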



    • #3
      Hello Clyde Schechter ,
      I have a similar question. I tried to adapt the code you suggested to Tom to my dataset, but I couldn't reshape it. This is a general view of how my dataset looks:
      year  Province  Estimate                 Value
      1981  Alberta   Real GDP                 15000
      1981  Alberta   Government Expenditures  25000
      1981  Alberta   Imports                  18000
      1982  Alberta   Real GDP                 15500
      1982  Alberta   Government Expenditures  25500
      1982  Alberta   Imports                  12000
      1981  Nanuvak   Real GDP                  7500
      1981  Nanuvak   Government Expenditures   8500
      1981  Nanuvak   Imports                   3000
      1982  Nanuvak   Real GDP                  8100
      That is just an example of my dataset's structure. I can copy the dataex output, but it is too long: my dataset runs from 1981 to 2019, with around 3,150 observations, and a short extract (say, observations 1/30 or even 1/100) does not give a good picture of its structure. Perhaps I do not know how to use dataex correctly in this case.

      I want to convert the structure shown above to:
      year  Province  Real GDP  Government Expenditures  Imports
      1981  Alberta   15000     25000                    18000
      1982  Alberta   15500     25500                    12000
      1981  Nanuvak    7500      8500                     3000
      1982  Nanuvak    8100     15500                     3200

      I tried with this code (following your response to Tom):

      reshape wide value, i(geo) j(year) string

      But I got this error message:

      variable year is numeric
      r(109);

      Thank you very much for any help or suggestion you can provide.

      Rafael.-



      • #4
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input int year str7 province str23 estimate int value
        1981 "Alberta" "Real GDP"                15000
        1981 "Alberta" "Government Expenditures" 25000
        1981 "Alberta" "Imports"                 18000
        1982 "Alberta" "Real GDP"                15500
        1982 "Alberta" "Government Expenditures" 25500
        1982 "Alberta" "Imports"                 12000
        1981 "Nanuvak" "Real GDP"                 7500
        1981 "Nanuvak" "Government Expenditures"  8500
        1981 "Nanuvak" "Imports"                  3000
        1982 "Nanuvak" "Real GDP"                 8100
        end
        
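        * Number the distinct values of estimate 1, 2, ... and keep the texts as value labels
        egen which = group(estimate), label
        su which, meanonly
        local max = r(max)
        * Save each label text so that it can be restored after reshaping
        forval j = 1/`max' {
            local label`j' : label (which) `j'
        }
        drop estimate
        * One row per year and province; which supplies the suffixes of value1, value2, ...
        reshape wide value, i(year province) j(which)
        * Attach the saved texts as variable labels
        forval j = 1/`max' {
            label var value`j' "`label`j''"
        }

        d

        list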
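
        For comparison, here is a sketch of the -strtoname()- route from #2 applied to the same example data. It assumes the variable names of the -dataex- example above and that descriptive variable names are preferred to value1, value2 and so on. The error in #3 appears to arise because the -string- option declares the j() variable to be a string while year is numeric; in any case j() should name the variable whose values become the new variable names (estimate), and i() the variables that identify each row of the result (year and province).

```stata
* Same example data as above
clear
input int year str7 province str23 estimate int value
1981 "Alberta" "Real GDP"                15000
1981 "Alberta" "Government Expenditures" 25000
1981 "Alberta" "Imports"                 18000
1982 "Alberta" "Real GDP"                15500
1982 "Alberta" "Government Expenditures" 25500
1982 "Alberta" "Imports"                 12000
1981 "Nanuvak" "Real GDP"                 7500
1981 "Nanuvak" "Government Expenditures"  8500
1981 "Nanuvak" "Imports"                  3000
1982 "Nanuvak" "Real GDP"                 8100
end

* Make the estimate texts legal as parts of variable names (spaces become underscores)
replace estimate = strtoname(estimate)

* estimate is a string variable, so the -string- option is needed here
reshape wide value, i(year province) j(estimate) string

* Strip the stub, leaving Real_GDP, Government_Expenditures and Imports
rename value* *

list
```

        Combinations that are absent from the long data, such as Imports for Nanuvak in 1982 in this extract, come out as missing values.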



        • #5
          It worked perfectly... Thank you very much.-
