  • Replace missing values with existing observations

    I am trying to replace missing values of a variable with the value observed in another year. My dataset contains many companies.
    For instance, say company 1 has three observations:
    Year 1999
    Year 2000
    Year 2001
    For year 2001 the value of my variable is 1; for years 1999 and 2000 it is missing. I want to replace those missing values with 1.
    The same pattern holds for all the other companies.

    Thanks in advance.

  • #2
    Hi

    I think it is better to provide sample data, so that everyone can understand exactly what you are looking for.

    I guess this code could be a good start:

    Code:
    * Note: this copies the value of Year itself into Var wherever Var is missing
    replace Var = Year if missing(Var)
    Regards
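
    To illustrate the kind of sample data asked for here, the pattern described in #1 could be typed in with -input-. This is only a sketch: the variable names company, year and myvar, and the value 4 for the second company, are made up for illustration and do not come from the original poster's data.

    ```stata
    * Hypothetical sample data matching the description in #1
    * (company, year and myvar are assumed names; the values are invented)
    clear
    input company year myvar
    1 1999 .
    1 2000 .
    1 2001 1
    2 1999 .
    2 2000 .
    2 2001 4
    end

    * Show the data grouped by company
    list, sepby(company) noobs
    ```

    For an existing dataset, the -dataex- command can produce a comparable block that others can paste straight into Stata.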



    • #3
      Sofie Rose,

      It is not very clear what you want. Here I assume that for an -id- you want to replace a missing value in one year with a nonmissing value from another year. I used -egen-'s -max()- function to find the largest nonmissing value for the -id-. Is that what you want?

      Code:
      . list, clean noobs
          id   year   var1 
           1   1999      1 
           1   2000      . 
           1   2001      1 
           2   1999      1 
           2   2000      2 
           2   2001      . 
      
      . egen var1x = max(var1) , by(id)
      . replace var1=var1x if missing(var1)
      
      . list, clean noobs
          id   year   var1   var1x 
           1   1999      1       1 
           1   2000      1       1 
           1   2001      1       1 
           2   1999      1       2 
           2   2000      2       2 
           2   2001      2       2
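
      Because -max()- picks the largest value whenever an -id- has more than one distinct nonmissing value, it may be worth checking first whether that happens. The following is a minimal sketch, assuming the same variable names as above (id, year, var1) and run before the -replace-; var1lo and var1hi are helper names introduced here.

      ```stata
      * egen's min() and max() both ignore missing values
      egen var1lo = min(var1), by(id)
      egen var1hi = max(var1), by(id)

      * List the ids whose nonmissing values are not all the same;
      * for these, filling with the maximum is an arbitrary choice
      list id year var1 if var1lo != var1hi, sepby(id) noobs

      * Remove the helper variables
      drop var1lo var1hi
      ```

      In the example data above, id 2 would be listed, since it has both 1 and 2 as nonmissing values.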



      • #4
        Alternatively, you may want to replace the missing value with the most recent nonmissing value for that id. This example does that, and points out that it's not clear what to do if the value for the first year is missing.
        Code:
        . gen newvar = var
        (5 missing values generated)
        
        . bysort id (year): replace newvar = newvar[_n-1] if missing(newvar) & _n>1 
        (4 real changes made)
        
        . list, noobs sepby(id)
        
          +--------------------------+
          | id   year   var   newvar |
          |--------------------------|
          |  1   1999     1        1 |
          |  1   2000     .        1 |
          |  1   2001     .        1 |
          |--------------------------|
          |  2   1999     1        1 |
          |  2   2000     2        2 |
          |  2   2001     .        2 |
          |--------------------------|
          |  3   1999     .        . |
          |  3   2000     3        3 |
          |  3   2001     .        3 |
          +--------------------------+
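
        In the case described in #1 the nonmissing value is in the last year, so the fill would have to run backward rather than forward. One way to sketch that, assuming the same variables as above (id, year, var), is to sort on a negated year so that later years come first within each id; newvar2 and negyear are names introduced here for illustration.

        ```stata
        * Start from a copy of the original variable
        gen newvar2 = var

        * Helper variable: sorting on -year puts the latest year first within id
        gen negyear = -year

        * Fill each missing value from the following year's (already filled) value
        bysort id (negyear): replace newvar2 = newvar2[_n-1] if missing(newvar2) & _n > 1

        * Restore the usual order and drop the helper
        sort id year
        drop negyear

        list, noobs sepby(id)
        ```

        With the data shown above, id 3 would get 3 for 1999, while missing values in the last year of an id stay missing, which mirrors the first-year problem of the forward fill.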
