Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Filling variable with data from closest year

    Hi everyone,


    I have a dataset of the following form:
    country year score
    France 2000 .
    France 2001 .
    France 2002 5
    France 2003 5
    France 2004 6
    France 2005 .
    Germany 2000 .
    Germany 2001 9
    Germany 2002 7
    Germany 2003 7
    Germany 2004 .
    Germany 2005 .
    I want to fill in the missing data in the "score" column with the value of the closest year. Thus, for France the "score" values for years 2000/2001 would become 5 and for 2005 it would be come 6. Similarly, for Germany the "score" value for 2000 would become 9 and the value for 2004/2005 would become 7.

    I tried doing:

    Code:
    xtset country year
    bysort country (score): replace score = score[_n-1] if missing(score)
    However, then all the missing values take the number of the earliest year (5 for France and 9 for Germany).

    Does anyone know any way of solving this?

    Thank you very much!

  • #2
    Code:
    sort country year
    by country: replace score = score[_n-1] if mi(score)
    
    gsort country -year
    by country: replace score = score[_n-1] if mi(score)

    Comment


    • #3
      great thank you!!

      Comment


      • #4
        Oyvind Snilsberg 's code will give the right answer if there are no gaps in the data. But if there are gaps in the data there could be a problem. The code for example might replace a value that was from one year earlier by one that was from two or more years later.

        There is another puzzle. What if say the 2020 value was missing but 2019 and 2021 non-missing values are available: which counts as closest (closer)?

        mipolate from SSC -- see https://www.statalist.org/forums/for...-interpolation -- offers a choice of rules here. It uses nearest as option name.

        Plain ipolate covers both cases quite well. If you wanted to insist on integer results, then think up a good rounding rule to adjust.

        Comment

        Working...
        X