  • Replace missing with first non-missing based on condition

    I have a variation on the problem of replacing missings with the first non-missing within group: https://www.stata.com/support/faqs/d...issing-values/

    I have an unbalanced panel with possibly multiple gaps within id. I used fillin to balance it and now want to replace x in the newly created missing observations with either the previous or the next value. So I need different rules depending on whether a gap falls in the middle of an id's series or at the beginning. For gaps at the beginning, I want to use the first following non-missing value (which I have also tagged in the MWE below).

    Here is an MWE. id and t are the panel and time variables, and x is the variable of interest. Where _fillin == 1 at the start of a panel, x should be replaced with its value from the observation where sample_enter == 1.

    Code:
    clear all
    
    input id t x _fillin sample_enter
    1 1 . 1 0
    1 2 . 1 0 
    1 3 . 1 0 
    1 4 110 0 1
    1 5 110 0 .
    1 6 . 1 .
    1 7 . 1 .
    end
    
    gen x2 = x
    bysort id (t x2): replace x2 = x2[_n+1] if missing(x2) & _fillin==1
    The code does not work. In the MWE, the observations at t = 1, 2, 3 should get x = 110, while the second gap (t = 6, 7) should be left unaffected. I hope that is clear.

  • #2
    Henry, below is a solution for the case where the missings appear at the beginning: they are replaced by the first non-missing value at a later t.

    Code:
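    * Sort each id in reverse time order so the "next" value sits at _n-1
    gsort id -t
    * replace works down the rows in order, so the fill cascades through a run of missings
    by id: replace x = x[_n-1] if mi(x)
    * Restore the original panel order
    sort id t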
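
    Note that the code above overwrites x and would also fill a gap in the middle of an id with the following value. If the fill should be limited to gaps at the start of each id, one possible variation is sketched below. It is only a sketch under assumptions: the variable names seen, negt and x2 are made up for illustration, and a negated time variable is swapped in for gsort to get the reverse ordering.

```stata
clear
input id t x _fillin
1 1 .   1
1 2 .   1
1 3 .   1
1 4 110 0
1 5 110 0
1 6 .   1
1 7 .   1
end

* Running count of non-missing x within id (hypothetical helper variable);
* it is 0 only for observations before the first non-missing value
bysort id (t): gen long seen = sum(!missing(x))

* Work on a copy so the original x is preserved
gen x2 = x

* Reverse time order within id, so the following value sits at _n-1
gen negt = -t
bysort id (negt): replace x2 = x2[_n-1] if missing(x2) & seen == 0

* Restore the panel order and drop the helpers
sort id t
drop seen negt

list id t x x2, sepby(id)
```

    In the MWE this should set x2 to 110 for t = 1, 2, 3 and leave the gap at t = 6, 7 missing, because seen is already positive there.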
