  • Dropping observations in Time Series Data

    Hi,
    I am using economic data from the Penn World Table. I am looking for a simple way to drop countries that do not have observations for every year from 1950 to 2011; that is, I want to work only with countries that have observations for the entire 1950-2011 series. Some countries only have observations starting from 1970. How can I drop these countries without manually scrolling through the data?

    If I run
    Code:
    keep if year == 1950/2011 & rgdpe!=.
    the entire dataset is dropped. rgdpe is real GDP (expenditure side), the variable I intend to use, so I do not want it dropped, at least not for countries with observations for the entire series. I also tried this:
    Code:
    by country, sort: keep if rgdpe!=.
    which only dropped the observations with missing values. Either way, I don't get the "balanced" time series data.

    Thanks.

  • #2
    Evidently no value of year is equal to 1950/2011, which is .96966683 to 8 decimal places. The only syntax element that will fit there is an expression, not a numlist. Syntax that is, I imagine, closer to what is wanted is

    Code:
      
     keep if inrange(year, 1950, 2011) & rgdpe != .
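    To see the difference between the two conditions, here is a minimal sketch on a toy dataset. The 62 generated observations are an assumption made purely for illustration and are not the Penn World Table.

    ```stata
    * toy data: one observation per year, 1950 through 2011
    clear
    set obs 62
    generate int year = 1949 + _n

    * 1950/2011 is evaluated as a division, not as a range of years
    display 1950/2011

    * no year equals that quotient, so no observation satisfies this
    count if year == 1950/2011

    * inrange() expresses the intended closed interval
    count if inrange(year, 1950, 2011)
    ```

    Because the first condition is never true, keep if with that condition drops every observation, which matches what was reported in #1.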


    • #3
      Thank you. This works the same way as
      Code:
      by country, sort: keep if rgdpe!=.
      simply because every country has the same set of years, 1950 to 2011. For countries whose data were not recorded until some later year, say 1970, the variables are just missing before that year and have values from 1970 onwards. I therefore want to drop every country that has missing values in any year since 1950. The Penn World Table data are publicly available at
      HTML Code:
      http://www.rug.nl/research/ggdc/data/pwt/pwt-8.1
      under the experts data tab.

      Thanks


      • #4
        As I understand it, you want all countries that have the full 62 years of data on rgdpe (1950 through 2011, inclusive). This will accomplish that:

        Code:
        **make sure all cases have the same 62 years
        keep if inrange(year, 1950, 2011)
        
        **count the number of non-missing values on rgdpe in a country
        bysort country: egen tot_nomiss=count(rgdpe)
        
        **now drop if the number of non-missing is less than 62
        drop if tot_nomiss<62
        Stata/MP 14.1 (64-bit x86-64)
        Revision 19 May 2016
        Win 8.1
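
        A variant that avoids hard-coding the number of years is to count missing values per country and drop any country that has at least one. The sketch below uses a made-up three-year toy dataset; the variable name n_miss and the country codes are hypothetical, and it assumes every country already has one row per year, as described in #3.

        ```stata
        * toy data: AAA is complete, BBB is missing rgdpe in 1950
        clear
        input str3 country int year double rgdpe
        "AAA" 1950 10
        "AAA" 1951 11
        "AAA" 1952 12
        "BBB" 1950 .
        "BBB" 1951 5
        "BBB" 1952 6
        end

        * number of missing rgdpe values within each country
        bysort country: egen n_miss = total(missing(rgdpe))

        * keep only countries with no missing years
        drop if n_miss > 0

        * sanity checks: nothing missing remains, one row per country-year
        assert !missing(rgdpe)
        isid country year
        ```

        On the real data, the same two commands after the keep if inrange(...) line should leave only countries with a complete rgdpe series, provided the rows for the early years exist with missing values rather than being absent altogether.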


        • #5
          That is it! Thanks. It works perfectly.
