  • Problem with panel data - data that is on different columns to be on the same row to stop duplications.

    Dear all,

    I am trying to declare my panel data as time-series data with the command:
    'tsset countrySCM year'. However, I have duplicate observations, because each variable sits on its own row for the same country and year. See below for an example:

    Code:
    input str14 country str4 var str38 industry int year str18 currency str9 powercode double value float(euro_V output value_added GrossFCF Labour_C Ee) long countrySCM byte isdup
    "Australia" "LABR" " Agriculture, forestry and fishing [A]" 1970 "Australian Dollar" "Millions" 591 319.14 . . . 319.14 . 1 2
    "Australia" "GFCF" " Agriculture, forestry and fishing [A]" 1970 "Australian Dollar" "Millions" 2428 1311.12 . . 1311.12 . . 1 2
    "Australia" "VALU" " Agriculture, forestry and fishing [A]" 1970 "Australian Dollar" "Millions" 2311 1247.94 . 1247.94 . . . 1 2
    "Australia" "GFCF" " Agriculture, forestry and fishing [A]" 1971 "Australian Dollar" "Millions" 2467 1332.18 . . 1332.18 . . 1 3
    "Australia" "EMPN" " Agriculture, forestry and fishing [A]" 1971 "Persons" "Millions" 411.709 . . . . . .411709 1 3
    "Australia" "VALU" " Agriculture, forestry and fishing [A]" 1971 "Australian Dollar" "Millions" 2619 1414.26 . 1414.26 . . . 1 3
    "Australia" "LABR" " Agriculture, forestry and fishing [A]" 1971 "Australian Dollar" "Millions" 596 321.84 .
    As above, I want all the data for Australia in 1970 to be on the same row. The variables I have generated are all formatted as long. countrySCM is the numeric version of country. The isdup variable is part of my attempt to find an answer and can be ignored in this topic.

    Kind regards,

    Lucas
    Last edited by Lucas Cotty; 26 Mar 2020, 05:16.

  • #2
    Welcome to Statalist. You will increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

    You confuse me when you talk about having panel data that you want to treat as time series. But as I understand it, you want to take the three different values that are associated with 1970 and put them all on one row.


    Code:
    clear 
    input str14 country str4 var str38 industry int year str18 currency str9 powercode double value float(euro_V output value_added GrossFCF Labour_C Ee) long countrySCM byte isdup
    "Australia" "LABR" " Agriculture, forestry and fishing [A]" 1970 "Australian Dollar" "Millions" 591 319.14 . . . 319.14 . 1 2
    "Australia" "GFCF" " Agriculture, forestry and fishing [A]" 1970 "Australian Dollar" "Millions" 2428 1311.12 . . 1311.12 . . 1 2
    "Australia" "VALU" " Agriculture, forestry and fishing [A]" 1970 "Australian Dollar" "Millions" 2311 1247.94 . 1247.94 . . . 1 2
    "Australia" "GFCF" " Agriculture, forestry and fishing [A]" 1971 "Australian Dollar" "Millions" 2467 1332.18 . . 1332.18 . . 1 3
    "Australia" "EMPN" " Agriculture, forestry and fishing [A]" 1971 "Persons" "Millions" 411.709 . . . . . .411709 1 3
    "Australia" "VALU" " Agriculture, forestry and fishing [A]" 1971 "Australian Dollar" "Millions" 2619 1414.26 . 1414.26 . . . 1 3
    "Australia" "LABR" " Agriculture, forestry and fishing [A]" 1971 "Australian Dollar" "Millions" 596 321.84 .
    end
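    * keep only the variables needed for this illustration
    keep year var value

    * flag the first observation within each year
    bysort year: g first=_n==1
    * copy var and value from the next two rows of the same year onto the first row
    foreach j in var value {
    g `j'1=`j'[_n+1] if first==1 & year==year[_n+1]
    g `j'2=`j'[_n+2] if first==1 & year==year[_n+2]
    }
    list
    * keep one row per year
    keep if first==1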
    If some years have more than 3 observations, you can just add to the generates in the loop.
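
As a possible alternative to adding generate statements by hand, here is a minimal sketch using reshape wide. It is not the code above, just another route to one row per country and year. It assumes that var, together with countrySCM and year, identifies each row, and that only country, var, year, value and countrySCM are kept. In the full data, variables such as currency differ within a country-year, so they would have to be dropped first, and industry would need to go into i() if there is more than one industry. The example data are cut down from the rows posted above.

```stata
clear
input str14 country str4 var int year double value long countrySCM
"Australia" "LABR" 1970 591 1
"Australia" "GFCF" 1970 2428 1
"Australia" "VALU" 1970 2311 1
"Australia" "GFCF" 1971 2467 1
"Australia" "EMPN" 1971 411.709 1
"Australia" "VALU" 1971 2619 1
"Australia" "LABR" 1971 596 1
end

* one row per country-year; one new column per code stored in var
* (valueEMPN, valueGFCF, valueLABR, valueVALU)
reshape wide value, i(countrySCM country year) j(var) string

* check that country-year now identifies the rows, then declare the panel
isid countrySCM year
tsset countrySCM year
list, noobs
```

Codes that are absent in a given year (EMPN in 1970 here) simply come out as missing in that row, so the number of codes per year does not need to be known in advance.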
