Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating missing values within years in a panel data

    Dear all,
    I use Stata13 and I have a panel data of 44 countries (c_id 1 to 44) and 6 variables - ps, va, ccr, rol, rq and ge. All the countries have missing values for all the variables for the following years: 1997, 1999 and 2001. I want to create these values using the average of the pre- and post years.

    For instance, to generate values of ps:
    for year1997 = (ps1996 + ps1998)/2
    for year1999 = (ps1998 + ps2000)/2
    for year2001 = (ps2000 + ps2001)/2

    Is there an easy way to do this in Stata....considering the large sample, it is cumbersome using excel. Any suggestions will do.

    Thank you.

    (I was unable to attach the Stata file, so I just copied some portion of the data)
    c_id country cc year ccr ge ps rol rq va
    1 ANGOLA AGO 1996 -1.16 -0.84 -2.10 -1.63 -1.46 -1.64
    1 ANGOLA AGO 1997
    1 ANGOLA AGO 1998 -1.32 -1.36 -2.34 -1.70 -1.72 -1.44
    1 ANGOLA AGO 1999
    1 ANGOLA AGO 2000 -1.52 -1.46 -2.08 -1.63 -1.83 -1.48
    1 ANGOLA AGO 2001
    1 ANGOLA AGO 2002 -1.14 -1.25 -1.61 -1.61 -1.49 -1.24
    1 ANGOLA AGO 2003 -1.30 -1.14 -1.03 -1.53 -1.23 -1.27
    2 BENIN BEN 1996 -0.93 -0.41 0.96 -0.19 -0.20 0.12
    2 BENIN BEN 1997
    2 BENIN BEN 1998 -0.60 -0.27 0.66 -0.14 -0.17 0.34
    2 BENIN BEN 1999
    2 BENIN BEN 2000 -0.49 -0.27 0.72 -0.27 -0.21 0.36
    2 BENIN BEN 2001
    2 BENIN BEN 2002 -0.79 -0.38 0.74 -0.34 -0.44 -0.00
    3 BOTSWANA BWA 1996 0.59 0.47 0.92 0.50 0.76 0.87
    3 BOTSWANA BWA 1997
    3 BOTSWANA BWA 1998 0.71 0.50 0.87 0.58 0.70 0.73
    3 BOTSWANA BWA 1999
    3 BOTSWANA BWA 2000 0.67 0.51 0.95 0.53 0.64 0.64
    3 BOTSWANA BWA 2001
    4 BURKINA FASO BFA 1996 0.22 -1.02 -0.41 -1.03 -0.25 -0.73
    4 BURKINA FASO BFA 1997
    4 BURKINA FASO BFA 1998 -0.24 -0.75 -0.20 -0.85 -0.39 -0.57
    4 BURKINA FASO BFA 1999
    4 BURKINA FASO BFA 2000 -0.14 -0.62 -0.01 -0.67 -0.12 -0.36
    4 BURKINA FASO BFA 2001
    4 BURKINA FASO BFA 2002 -0.02 -0.72 -0.37 -0.68 -0.14 -0.48
    5 BURUNDI BDI 1996 -1.39 -1.73 -2.24 -1.72 -1.67 -1.75
    5 BURUNDI BDI 1997
    5 BURUNDI BDI 1998 -1.15 -1.66 -2.42 -1.48 -1.59 -1.61
    5 BURUNDI BDI 1999
    5 BURUNDI BDI 2000 -1.01 -1.44 -2.15 -1.56 -1.22 -1.63
    5 BURUNDI BDI 2001
    5 BURUNDI BDI 2002 -0.94 -1.45 -2.39 -1.39 -1.32 -1.22
    6 CAMEROON CMR 1996 -1.16 -1.00 -1.06 -1.47 -1.13 -0.97
    6 CAMEROON CMR 1997
    6 CAMEROON CMR 1998 -1.04 -0.74 -0.82 -1.12 -0.64 -1.14
    6 CAMEROON CMR 1999
    6 CAMEROON CMR 2000 -1.06 -0.67 -0.57 -1.23 -0.59 -1.14
    6 CAMEROON CMR 2001
    6 CAMEROON CMR 2002 -1.09 -0.82 -0.72 -1.25 -0.89 -1.17
    .................................................. .................................................. ...............
    .................................................. .................................................. ...............
    .................................................. .................................................. ..............
    43 ZAMBIA ZMB 1996 -1.03 -1.06 -0.23 -0.65 -0.42 -0.40
    43 ZAMBIA ZMB 1997
    43 ZAMBIA ZMB 1998 -0.88 -0.86 0.09 -0.53 -0.12 -0.60
    43 ZAMBIA ZMB 1999
    43 ZAMBIA ZMB 2000 -0.85 -0.86 -0.06 -0.54 -0.26 -0.49
    43 ZAMBIA ZMB 2001
    43 ZAMBIA ZMB 2002 -0.94 -0.81 -0.35 -0.39 -0.61 -0.37
    44 ZIMBABWE ZWE 1996 -0.25 -0.23 -0.53 -0.82 -0.98 -0.62
    44 ZIMBABWE ZWE 1997
    44 ZIMBABWE ZWE 1998 -0.65 -0.37 -0.88 -0.71 -0.79 -0.82
    44 ZIMBABWE ZWE 1999
    44 ZIMBABWE ZWE 2000 -0.95 -0.78 -1.42 -1.33 -1.46 -1.11
    44 ZIMBABWE ZWE 2001
    44 ZIMBABWE ZWE 2002 -1.21 -0.90 -1.59 -1.59 -1.97 -1.41
    44 ZIMBABWE ZWE 2003 -1.25 -0.95 -1.13 -1.68 -2.00 -1.44
    44 ZIMBABWE ZWE 2004 -1.31 -0.95 -1.26 -1.76 -2.03 -1.55


  • #2
    Try this:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float c_id str16 country str3 cc float(year ccr ge ps rol rq va)
     1 "ANGOLA"   "AGO" 1996 -1.16  -.84  -2.1 -1.63 -1.46 -1.64
     1 "ANGOLA"   "AGO" 1997     .     .     .     .     .     .
     1 "ANGOLA"   "AGO" 1998 -1.32 -1.36 -2.34  -1.7 -1.72 -1.44
     1 "ANGOLA"   "AGO" 1999     .     .     .     .     .     .
     1 "ANGOLA"   "AGO" 2000 -1.52 -1.46 -2.08 -1.63 -1.83 -1.48
     1 "ANGOLA"   "AGO" 2001     .     .     .     .     .     .
     1 "ANGOLA"   "AGO" 2002 -1.14 -1.25 -1.61 -1.61 -1.49 -1.24
     1 "ANGOLA"   "AGO" 2003  -1.3 -1.14 -1.03 -1.53 -1.23 -1.27
     2 "BENIN"    "BEN" 1996  -.93  -.41   .96  -.19   -.2   .12
     2 "BENIN"    "BEN" 1997     .     .     .     .     .     .
     2 "BENIN"    "BEN" 1998   -.6  -.27   .66  -.14  -.17   .34
     2 "BENIN"    "BEN" 1999     .     .     .     .     .     .
     2 "BENIN"    "BEN" 2000  -.49  -.27   .72  -.27  -.21   .36
     2 "BENIN"    "BEN" 2001     .     .     .     .     .     .
     2 "BENIN"    "BEN" 2002  -.79  -.38   .74  -.34  -.44     0
     3 "BOTSWANA" "BWA" 1996   .59   .47   .92    .5   .76   .87
     3 "BOTSWANA" "BWA" 1997     .     .     .     .     .     .
     3 "BOTSWANA" "BWA" 1998   .71    .5   .87   .58    .7   .73
     3 "BOTSWANA" "BWA" 1999     .     .     .     .     .     .
     3 "BOTSWANA" "BWA" 2000   .67   .51   .95   .53   .64   .64
     3 "BOTSWANA" "BWA" 2001     .     .     .     .     .     .
     5 "BURUNDI"  "BDI" 1996 -1.39 -1.73 -2.24 -1.72 -1.67 -1.75
     5 "BURUNDI"  "BDI" 1997     .     .     .     .     .     .
     5 "BURUNDI"  "BDI" 1998 -1.15 -1.66 -2.42 -1.48 -1.59 -1.61
     5 "BURUNDI"  "BDI" 1999     .     .     .     .     .     .
     5 "BURUNDI"  "BDI" 2000 -1.01 -1.44 -2.15 -1.56 -1.22 -1.63
     5 "BURUNDI"  "BDI" 2001     .     .     .     .     .     .
     5 "BURUNDI"  "BDI" 2002  -.94 -1.45 -2.39 -1.39 -1.32 -1.22
     6 "CAMEROON" "CMR" 1996 -1.16    -1 -1.06 -1.47 -1.13  -.97
     6 "CAMEROON" "CMR" 1997     .     .     .     .     .     .
     6 "CAMEROON" "CMR" 1998 -1.04  -.74  -.82 -1.12  -.64 -1.14
     6 "CAMEROON" "CMR" 1999     .     .     .     .     .     .
     6 "CAMEROON" "CMR" 2000 -1.06  -.67  -.57 -1.23  -.59 -1.14
     6 "CAMEROON" "CMR" 2001     .     .     .     .     .     .
     6 "CAMEROON" "CMR" 2002 -1.09  -.82  -.72 -1.25  -.89 -1.17
    43 "ZAMBIA"   "ZMB" 1996 -1.03 -1.06  -.23  -.65  -.42   -.4
    43 "ZAMBIA"   "ZMB" 1997     .     .     .     .     .     .
    43 "ZAMBIA"   "ZMB" 1998  -.88  -.86   .09  -.53  -.12   -.6
    43 "ZAMBIA"   "ZMB" 1999     .     .     .     .     .     .
    43 "ZAMBIA"   "ZMB" 2000  -.85  -.86  -.06  -.54  -.26  -.49
    43 "ZAMBIA"   "ZMB" 2001     .     .     .     .     .     .
    43 "ZAMBIA"   "ZMB" 2002  -.94  -.81  -.35  -.39  -.61  -.37
    44 "ZIMBABWE" "ZWE" 1996  -.25  -.23  -.53  -.82  -.98  -.62
    44 "ZIMBABWE" "ZWE" 1997     .     .     .     .     .     .
    44 "ZIMBABWE" "ZWE" 1998  -.65  -.37  -.88  -.71  -.79  -.82
    44 "ZIMBABWE" "ZWE" 1999     .     .     .     .     .     .
    44 "ZIMBABWE" "ZWE" 2000  -.95  -.78 -1.42 -1.33 -1.46 -1.11
    44 "ZIMBABWE" "ZWE" 2001     .     .     .     .     .     .
    44 "ZIMBABWE" "ZWE" 2002 -1.21   -.9 -1.59 -1.59 -1.97 -1.41
    44 "ZIMBABWE" "ZWE" 2003 -1.25  -.95 -1.13 -1.68    -2 -1.44
    44 "ZIMBABWE" "ZWE" 2004 -1.31  -.95 -1.26 -1.76 -2.03 -1.55
    end
    
    
    // VERIFY AT MOST ONE OBSERVATOIN PER COUNTRY PER YEAR
    isid country year
    xtset c_id year
    foreach v of varlist ccr-va {
        replace `v' = 0.5*(L1.`v' + F1.`v') if inlist(year, 1997, 1999, 2001)
    }
    Note: This code depends critically on (and verifies) the assumption that the data never has two observations for the same year in the same country. If that is wrong, the your proposed actions are, in any case, ill-posed as it would not be clear which of the multiple such observations should be used in the averaging.

    In the future, please use the -dataex- command to post example data. While what you posted was not inordinately difficult to import to Stata to test the code, -dataex-, as used in my response, makes it possible for those who want to help you to create a completely faithful replica of your Stata example with just a simple copy/paste operation. You can get the -dataex- command by running -ssc install dataex-. Then run -help dataex- to read the simple instructions for using it. Use -dataex- whenever you post example data going forward.

    Comment


    • #3
      Alternatively, this is just linear interpolation and -ipolate- can do the job.

      Comment


      • #4
        Hi Clyde, thanks a lot for the command, it works perfectly. I also appreciate the information regarding using "dataex" and I have made the attempt to use it but the output did not turn out well, that is the identifier, country cc and year did not show. only the values are displayed.

        Here's the code I used with the output:

        dataex ccr ge ps rol rq va if inrange(year, 1996, 2015)

        ----------------------- copy starting from the next line -----------------------
        [CODE]
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input double(ccr ge ps rol rq va)
        -1.16 -.84 -2.1 -1.63 -1.46 -1.64
        -1.24 -1.1 -2.22 -1.67 -1.59 -1.54
        -1.32 -1.36 -2.34 -1.7 -1.72 -1.44
        -1.42 -1.41 -2.21 -1.67 -1.77 -1.46
        -1.52 -1.46 -2.08 -1.63 -1.83 -1.48
        -1.33 -1.36 -1.85 -1.62 -1.66 -1.36
        -1.14 -1.25 -1.61 -1.61 -1.49 -1.24
        -1.3 -1.14 -1.03 -1.53 -1.23 -1.27
        -1.28 -1.28 -1.04 -1.46 -1.25 -1.28
        -1.33 -1.17 -.89 -1.44 -1.29 -1.24
        -1.18 -1.35 -.54 -1.29 -1.12 -1.23
        -1.29 -1.23 -.7 -1.39 -1.03 -1.17
        -1.28 -1.07 -.35 -1.39 -1.06 -1.13
        -1.42 -.97 -.37 -1.24 -1.02 -1.13
        -1.32 -1.13 -.22 -1.26 -1.02 -1.12
        -1.34 -1.15 -.36 -1.25 -1.08 -1.12
        -1.29 -1 -.38 -1.27 -.96 -1.07
        -1.33 -1.22 -.39 -1.26 -1.05 -1.11

        How can I get it right?....thanks again!

        Comment


        • #5
          The identifier, country cc and year didn't show because you did not mention them in the command. So you can do:

          Code:
          dataex  c_id country cc year ccr ge ps rol rq va
          And, even simpler, if these are all the variables in your data set you can just type

          Code:
          dataex
          and it will include all variables in the data set.

          Comment


          • #6
            dataex, works perfectly...thanks Clyde!!!!!

            Comment


            • #7
              To follow up on #3 this could be

              Code:
              foreach v in ccr ge ps rol rq va {
                     ipolate `v' year, gen(`v'2) by(country)
              }
              which naturally is equivalent to Clyde's code in #2. ipolate doesn't assume or even use any tsset or xtset information.

              Comment


              • #8
                Thanks a lot for this Nick!

                Comment

                Working...
                X