Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Taking Mean of panel data

    I am working with panel data stata 16. I have panel of 120 countries and time 1990 to 2020. I want to take mean of 5 years only those countries that have every 5 year values. as in picture 2000 to 2004=2000 and 2005 to 2009= 2005. as you can see in pcture if a single year have missing value the county will drop. I have used

    collapse GDPpercapitaconstant2015US , by(id year)

    it took mean even one or two year data of these five years. for exampl if 2002 , 2003and 2004 have data and other two years have missing it take mean of three years. while i want to take only five years value otherwise drop that country for time of 2000 instead taking mean of three years.
    please help me how i can


    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str66 country double(year GDPpercapitaconstant2015US)
    "" . .
    "Afghanistan" 1960 .
    "Afghanistan" 1961 .
    "Afghanistan" 1962 .
    "Afghanistan" 1963 .
    "Afghanistan" 1964 .
    "Afghanistan" 1965 .
    "Afghanistan" 1966 .
    "Afghanistan" 1967 .
    "Afghanistan" 1968 .
    "Afghanistan" 1969 .
    "Afghanistan" 1970 .
    "Afghanistan" 1971 .
    "Afghanistan" 1972 .
    "Afghanistan" 1973 .
    "Afghanistan" 1974 .
    "Afghanistan" 1975 .
    "Afghanistan" 1976 .
    "Afghanistan" 1977 .
    "Afghanistan" 1978 .
    "Afghanistan" 1979 .
    "Afghanistan" 1980 .
    "Afghanistan" 1981 .
    "Afghanistan" 1982 .
    "Afghanistan" 1983 .
    "Afghanistan" 1984 .
    "Afghanistan" 1985 .
    "Afghanistan" 1986 .
    "Afghanistan" 1987 .
    "Afghanistan" 1988 .
    "Afghanistan" 1989 .
    "Afghanistan" 1990 .
    "Afghanistan" 1991 .
    "Afghanistan" 1992 .
    "Afghanistan" 1993 .
    "Afghanistan" 1994 .
    "Afghanistan" 1995 .
    "Afghanistan" 1996 .
    "Afghanistan" 1997 .
    "Afghanistan" 1998 .
    "Afghanistan" 1999 .
    "Afghanistan" 2000 .
    "Afghanistan" 2001 .
    "Afghanistan" 2002 319.8471072628973
    "Afghanistan" 2003 332.22000973489014
    "Afghanistan" 2004 322.6680094934977
    "Afghanistan" 2005 345.9258138419342
    "Afghanistan" 2006 353.7205970244704
    "Afghanistan" 2007 392.71046559533977
    "Afghanistan" 2008 398.9711165438658
    "Afghanistan" 2009 472.84229163256606
    "Afghanistan" 2010 526.1036756105412
    "Afghanistan" 2011 511.9985148484141
    "Afghanistan" 2012 557.9496921123398
    "Afghanistan" 2013 568.964543504121
    "Afghanistan" 2014 565.1792733139848
    "Afghanistan" 2015 556.0072208614432
    "Afghanistan" 2016 552.9969077170334
    "Afghanistan" 2017 553.3550517127551
    "Afghanistan" 2018 546.7430100785039
    "Afghanistan" 2019 555.1389962551129
    "Afghanistan" 2020 529.7412102849382
    "Afghanistan" 2021 .
    "Albania" 1960 .
    "Albania" 1961 .
    "Albania" 1962 .
    "Albania" 1963 .
    "Albania" 1964 .
    "Albania" 1965 .
    "Albania" 1966 .
    "Albania" 1967 .
    "Albania" 1968 .
    "Albania" 1969 .
    "Albania" 1970 .
    "Albania" 1971 .
    "Albania" 1972 .
    "Albania" 1973 .
    "Albania" 1974 .
    "Albania" 1975 .
    "Albania" 1976 .
    "Albania" 1977 .
    "Albania" 1978 .
    "Albania" 1979 .
    "Albania" 1980 1740.5053825157784
    "Albania" 1981 1804.0103068647295
    "Albania" 1982 1818.3673437801087
    "Albania" 1983 1799.8781451262264
    "Albania" 1984 1740.3472258197303
    "Albania" 1985 1735.2898577457725
    "Albania" 1986 1798.0146087898145
    "Albania" 1987 1748.5782622169356
    "Albania" 1988 1691.5305058908884
    "Albania" 1989 1808.6456043562898
    "Albania" 1990 1606.2960469550408
    "Albania" 1991 1163.4912935769614
    "Albania" 1992 1086.4384867925091
    "Albania" 1993 1197.5805938472638
    "Albania" 1994 1305.0007187137162
    "Albania" 1995 1488.0204913734217
    "Albania" 1996 1633.5515721775157
    end
    [/CODE]
    Last edited by Shahla Akram; 21 Jul 2022, 15:35.

  • #2
    You should find a better way to handle missing data. Dropping nonmissing data is likely to bias your analysis. For the sake of illustration, here is how you would do what you want.

    Code:
    drop if missing(GDP)
    isid country year
    g period= floor(year/5)*5
    bys country period: drop if _N<5
    collapse GDP, by(country period)
    Res.:

    Code:
    . l, sepby(country)
    
         +----------------------------------+
         |     country   period   GDPperc~S |
         |----------------------------------|
      1. | Afghanistan     2005   392.83406 |
      2. | Afghanistan     2010   546.03914 |
      3. | Afghanistan     2015   552.84824 |
         |----------------------------------|
      4. |     Albania     1980   1780.6217 |
      5. |     Albania     1985   1756.4118 |
      6. |     Albania     1990   1271.7614 |
         +----------------------------------+

    Comment


    • #3
      Thanks but my other variable that have already 2000, 2005, 2010 donot show now in my data set


      clear
      input str66 country double(year GDP civ)
      "" . . .
      "Afghanistan" 1950 . .
      "Afghanistan" 1955 . .
      "Afghanistan" 1960 . .
      "Afghanistan" 1961 . .
      "Afghanistan" 1962 . .
      "Afghanistan" 1963 . .
      "Afghanistan" 1964 . .
      "Afghanistan" 1965 . .
      "Afghanistan" 1966 . .
      "Afghanistan" 1967 . .
      "Afghanistan" 1968 . .
      "Afghanistan" 1969 . .
      "Afghanistan" 1970 . .
      "Afghanistan" 1971 . .
      "Afghanistan" 1972 . .
      "Afghanistan" 1973 . .
      "Afghanistan" 1974 . .
      "Afghanistan" 1975 . .
      "Afghanistan" 1976 . .
      "Afghanistan" 1977 . .
      "Afghanistan" 1978 . .
      "Afghanistan" 1979 . .
      "Afghanistan" 1980 . .
      "Afghanistan" 1981 . .
      "Afghanistan" 1982 . .
      "Afghanistan" 1983 . .
      "Afghanistan" 1984 . .
      "Afghanistan" 1985 . .
      "Afghanistan" 1986 . .
      "Afghanistan" 1987 . .
      "Afghanistan" 1988 . .
      "Afghanistan" 1989 . .
      "Afghanistan" 1990 . .
      "Afghanistan" 1991 . .
      "Afghanistan" 1992 . .
      "Afghanistan" 1993 . .
      "Afghanistan" 1994 . .
      "Afghanistan" 1995 . .
      "Afghanistan" 1996 . .
      "Afghanistan" 1997 . .
      "Afghanistan" 1998 . .
      "Afghanistan" 1999 . .
      "Afghanistan" 2000 . .
      "Afghanistan" 2001 . .
      "Afghanistan" 2002 319.8471072628973 .
      "Afghanistan" 2003 332.22000973489014 .
      "Afghanistan" 2004 322.6680094934977 .
      "Afghanistan" 2005 345.9258138419342 .4788421
      "Afghanistan" 2006 353.7205970244704 .
      "Afghanistan" 2007 392.71046559533977 .
      "Afghanistan" 2008 398.9711165438658 .
      "Afghanistan" 2009 472.84229163256606 .
      "Afghanistan" 2010 526.1036756105412 .4783796
      "Afghanistan" 2011 511.9985148484141 .
      "Afghanistan" 2012 557.9496921123398 .
      "Afghanistan" 2013 568.964543504121 .
      "Afghanistan" 2014 565.1792733139848 .
      "Afghanistan" 2015 556.0072208614432 .
      "Afghanistan" 2016 552.9969077170334 .
      "Afghanistan" 2017 553.3550517127551 .
      "Afghanistan" 2018 546.7430100785039 .
      "Afghanistan" 2019 555.1389962551129 .
      "Afghanistan" 2020 529.7412102849382 .
      "Afghanistan" 2021 . .
      "Albania" 1950 . .
      "Albania" 1955 . .
      "Albania" 1960 . .
      "Albania" 1961 . .
      "Albania" 1962 . .
      "Albania" 1963 . .
      "Albania" 1964 . .
      "Albania" 1965 . .
      "Albania" 1966 . .
      "Albania" 1967 . .
      "Albania" 1968 . .
      "Albania" 1969 . .
      "Albania" 1970 . .
      "Albania" 1971 . .
      "Albania" 1972 . .
      "Albania" 1973 . .
      "Albania" 1974 . .
      "Albania" 1975 . .
      "Albania" 1976 . .
      "Albania" 1977 . .
      "Albania" 1978 . .
      "Albania" 1979 . .
      "Albania" 1980 1740.5053825157784 .
      "Albania" 1981 1804.0103068647295 .
      "Albania" 1982 1818.3673437801087 .
      "Albania" 1983 1799.8781451262264 .
      "Albania" 1984 1740.3472258197303 .
      "Albania" 1985 1735.2898577457725 .
      "Albania" 1986 1798.0146087898145 .
      "Albania" 1987 1748.5782622169356 .
      "Albania" 1988 1691.5305058908884 .
      "Albania" 1989 1808.6456043562898 .
      "Albania" 1990 1606.2960469550408 .3723597
      "Albania" 1991 1163.4912935769614 .
      "Albania" 1992 1086.4384867925091 .
      end


      just see
      clear
      input str66 country float period double GDPpercapitaconstant2015US
      "Afghanistan" 2005 392.83405692763523
      "Afghanistan" 2010 546.0391398778802
      "Afghanistan" 2015 552.8482373249697
      "Albania" 1980 1780.6216808213146
      "Albania" 1985 1756.4117677999402
      "Albania" 1990 1271.7614279770983
      "Albania" 1995 1602.27790532028
      "Albania" 2000 2250.9996852615077
      "Albania" 2005 3060.4837569202014
      "Albania" 2010 3725.5914325441654
      "Albania" 2015 4253.806422955755
      "Algeria" 1960 1849.3654227546424
      "Algeria" 1965 2038.7710892524096
      "Algeria" 1970 2435.32627927747
      "Algeria" 1975 2987.4726534305037
      "Algeria" 1980 3287.212797405563
      "Algeria" 1985 3275.550502294207
      "Algeria" 1990 2956.067970450005
      "Algeria" 1995 2928.3814406852734
      "Algeria" 2000 3330.8938070704635
      "Algeria" 2005 3809.604288410782
      "Algeria" 2010 4008.548680504806
      "Algeria" 2015 4172.7758148354915
      "American Samoa" 2005 12389.87745820759
      "American Samoa" 2010 11919.847657842603
      "American Samoa" 2015 11551.846025095621
      "Andorra" 1970 36268.99152887558
      "Andorra" 1975 35703.578070802134
      "Andorra" 1980 31874.95292324249
      "Andorra" 1985 29633.951973802374
      "Andorra" 1990 28682.38324304702
      "Andorra" 1995 30741.951302285415
      "Andorra" 2000 36612.84252671953
      "Andorra" 2005 38750.119060648096
      "Andorra" 2010 34263.25567405288
      "Andorra" 2015 37643.18117121981
      "Angola" 1980 3283.9809827541962
      "Angola" 1985 3250.840281858614
      "Angola" 1990 2497.9177099263375
      "Angola" 1995 2380.2630817841173
      "Angola" 2000 2649.297704946941
      "Angola" 2005 3706.437439282376
      "Angola" 2010 4125.796967180848
      "Angola" 2015 3787.2298097952053
      "Antigua and Barbuda" 1980 7346.606404989077
      "Antigua and Barbuda" 1985 10403.523708266406
      "Antigua and Barbuda" 1990 12175.015335232176
      "Antigua and Barbuda" 1995 12840.96181238307
      "Antigua and Barbuda" 2000 13730.501464290506
      "Antigua and Barbuda" 2005 16628.712025063607
      "Antigua and Barbuda" 2010 13755.677262532587
      "Antigua and Barbuda" 2015 15475.989813702201
      "Argentina" 1960 7385.8731417989375
      "Argentina" 1965 8385.791332917343
      "Argentina" 1970 9659.341382893273
      "Argentina" 1975 9892.318260988799
      "Argentina" 1980 9734.309716539348
      "Argentina" 1985 9137.226769058285
      "Argentina" 1990 9331.014766659675
      "Argentina" 1995 10792.3409497882
      "Argentina" 2000 9967.803679618182
      "Argentina" 2005 12358.195519096347
      "Argentina" 2010 13857.3399779038
      "Argentina" 2015 13312.535503139079
      "Armenia" 1990 1137.130794585059
      "Armenia" 1995 1088.047962613086
      "Armenia" 2000 1649.854631159149
      "Armenia" 2005 2874.829578370625
      "Armenia" 2010 3257.664497709158
      "Armenia" 2015 3894.1656369473626
      "Aruba" 1990 29036.666362762866
      "Aruba" 1995 29490.167768486222
      "Aruba" 2000 29331.388756122054
      "Aruba" 2005 29314.051173773954
      "Aruba" 2010 26551.424069080724
      "Australia" 1960 20424.28751868714
      "Australia" 1965 23737.331284369873
      "Australia" 1970 27539.512268558592
      "Australia" 1975 29240.381193185818
      "Australia" 1980 31288.548219543678
      "Australia" 1985 34515.31456787561
      "Australia" 1990 36960.828408227295
      "Australia" 1995 41574.671171825445
      "Australia" 2000 47197.41173838455
      "Australia" 2005 52076.92584313105
      "Australia" 2010 54987.09566266652
      "Australia" 2015 57797.93508827236
      "Austria" 1960 13006.744609069288
      "Austria" 1965 15678.511843575525
      "Austria" 1970 19912.612692212242
      "Austria" 1975 23389.279612460698
      "Austria" 1980 26063.435821593263
      "Austria" 1985 28660.932385515167
      "Austria" 1990 32224.99829017847
      "Austria" 1995 35531.26876311563
      "Austria" 2000 39626.12605943727
      "Austria" 2005 42962.11654372379
      "Austria" 2010 44175.98755578515
      "Austria" 2015 45385.30206987138
      "Azerbaijan" 1990 2110.4999901419974
      Last edited by Shahla Akram; 21 Jul 2022, 18:02.

      Comment


      • #4
        If you want to retain observations for those periods where the country has less than 5 years of data, but not show a calculated mean for those periods, you can just make a slight change to the code in #2:

        Code:
        drop if missing(GDP)
        isid country year
        g period= floor(year/5)*5
        collapse (mean) GDP (count) N = GDP, by(country period)
        replace GDP = . if N < 5
        Pay attention to the use of N in this code, which is different from _N in #2.

        Comment


        • #5
          Thanks Andrew Musau and Clyde Schechter i need futher help
          1- if change year 2000 to 2004 to 1998 to 2003 how same command work (1996, to 2000 and 1999 to 2003) for example use 2000 observation first year , last year and mid year. Frequncy remain same as five years but years rang change
          2- already five year gap variable remain in this collapse command and i need two data megre
          Thanks
          Last edited by Shahla Akram; 22 Jul 2022, 03:40.

          Comment


          • #6
            I do not think that I follow this. You want to define 5 year averages for 1998-2002 (note: not 2003) instead of 2000-2004, 2003-2007 instead of 2005-2009, and so on? Then you just add 2 to year in the floor function. Consider first:

            Code:
            forval year= 1997/2005{
                di "year `year' is in period `=floor(`year'/5)*5'"
            }
            Res.:

            Code:
            . forval year= 1997/2005{
              2. 
            .     di "year `year' is in period `=floor(`year'/5)*5'"
              3. 
            . }
            year 1997 is in period 1995
            year 1998 is in period 1995
            year 1999 is in period 1995
            year 2000 is in period 2000
            year 2001 is in period 2000
            year 2002 is in period 2000
            year 2003 is in period 2000
            year 2004 is in period 2000
            year 2005 is in period 2005
            and

            Code:
            forval year= 1997/2005{
                di "year `year' is in period `=floor((`year'+2)/5)*5'"
            }
            Res.:

            Code:
            . forval year= 1997/2005{
              2. 
            .     di "year `year' is in period `=floor((`year'+2)/5)*5'"
              3. 
            . }
            year 1997 is in period 1995
            year 1998 is in period 2000
            year 1999 is in period 2000
            year 2000 is in period 2000
            year 2001 is in period 2000
            year 2002 is in period 2000
            year 2003 is in period 2005
            year 2004 is in period 2005
            year 2005 is in period 2005
            If this is not what you are asking, provide a sample of your initial dataset and how you want the final dataset to look like.

            Comment


            • #7
              thanks a lot Andrew Musau

              Comment


              • #8
                See https://www.stata-journal.com/articl...article=dm0095 for a round-up of rounding for binning.

                Comment

                Working...
                X