  • Removing non-consecutive years from the dataset

    Hi all,

    I have a historical GDP dataset for more than 180 countries. However, the time range differs by country, and the years are not always consecutive. A sample of my data is below:
    Code:
    clear
    input str9 country float(year gdppc)
    "argentina" 1820 100
    "argentina" 1850 120
    "argentina" 1871 125
    "argentina" 1872 130
    "argentina" 1873 150
    "argentina" 1874 160
    "argentina" 1875 180
    "uruguay"   1900 175
    "uruguay"   1910  80
    "uruguay"   1911 100
    "uruguay"   1912 120
    "uruguay"   1913 122
    "uruguay"   1914 125
    "uruguay"   1915 135
    "bolivia"   1920  15
    "bolivia"   1924  20
    "bolivia"   1925  25
    "bolivia"   1926  28
    "bolivia"   1927  30
    "bolivia"   1928  35
    "bolivia"   1929  40
    end
    What code would remove the non-consecutive observations from the dataset? After running it, Argentina should start from 1871, Uruguay from 1910, and Bolivia from 1924.

    Many thanks in advance,

  • #2
    Assuming the rest of your data is also sorted by country and year, this should do the trick:
    Code:
    drop if country==country[_n+1] & year+1!=year[_n+1]
    Last edited by Emil Alnor; 09 Aug 2023, 07:02. Reason: spelling
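
    A note for readers whose data are less tidy than the sample: the one-liner in #2 drops any observation whose next year within the same country is not exactly one year later, which matches the request when the stray years are isolated observations at the start of each country. If a country has an earlier run of two or more consecutive years, only the last year of that run would be dropped. A minimal sketch of a spell-based alternative follows, assuming the goal is to keep only the most recent run of consecutive years per country. The variable name spell is hypothetical, and the data are the sample from the first post.

    ```stata
    clear
    input str9 country float(year gdppc)
    "argentina" 1820 100
    "argentina" 1850 120
    "argentina" 1871 125
    "argentina" 1872 130
    "argentina" 1873 150
    "argentina" 1874 160
    "argentina" 1875 180
    "uruguay"   1900 175
    "uruguay"   1910  80
    "uruguay"   1911 100
    "uruguay"   1912 120
    "uruguay"   1913 122
    "uruguay"   1914 125
    "uruguay"   1915 135
    "bolivia"   1920  15
    "bolivia"   1924  20
    "bolivia"   1925  25
    "bolivia"   1926  28
    "bolivia"   1927  30
    "bolivia"   1928  35
    "bolivia"   1929  40
    end

    * Within each country (sorted by year), start a new spell whenever the
    * year is not exactly one more than the previous year. The first
    * observation of each country always starts spell 1.
    * "spell" is a hypothetical helper variable, not part of the original data.
    bysort country (year): gen spell = sum(year != year[_n-1] + 1)

    * Keep only the last spell, i.e. the most recent run of consecutive years
    by country: keep if spell == spell[_N]
    drop spell
    ```

    On the sample data this leaves the same observations as the one-liner in #2, with Argentina starting in 1871, Uruguay in 1910 and Bolivia in 1924. Unlike the one-liner, it does not require the data to be sorted beforehand, because bysort does the sorting.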


    • #3
      Works like a charm! Thanks Emil Alnor
