  • define a dummy

    Dear all, I have been asked this question. The data set is:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(id year x y)
    1 2011 . 5
    1 2012 5 5
    1 2013 . 3
    1 2014 7 1
    1 2015 1 5
    2 2011 1 .
    2 2012 . 4
    2 2013 2 1
    2 2014 8 .
    2 2015 4 3
    3 2011 2 5
    3 2012 5 4
    3 2013 . 1
    3 2014 5 .
    3 2015 . 5
    end
    For each id, if a nonmissing value of y appears among that id's values of x (in any year), then a dummy (say, dum) should equal 1, and 0 otherwise. For instance, for id=1 and year=2011, y=5, which equals x=5 in year 2012, so dum=1. Clearly, dum=1 as well when id=1 and year=2012, because y=5 and x=5 in that same year. Similarly, for id=1 and year=2014, y=1, which equals x=1 in year 2015, so dum=1. Any suggestions?
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    If not for the missing values, this would be a one-liner. But the missing values interfere with the way -rangestat- works, so first we have to get rid of them. I have used the codes -999 and -888 for missing values of x and y, respectively. You can pick any numbers you like, provided they don't occur as actual values of x or y.

    Code:
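    * rangestat is community-contributed; to install: ssc install rangestat
    * Recode missing values to sentinel codes that never occur in the data
    mvencode x, mv(-999)
    mvencode y, mv(-888)
    
    * For each observation, count same-id observations whose x equals this y
    rangestat (count) dum = year, by(id) interval(x y y)
    replace dum = 0 if missing(dum)               // no matching x found
    replace dum = . if y == -888                  // y was missing, so dum stays missing
    replace dum = min(1, dum) if !missing(dum)    // collapse counts to a 0/1 dummy
    * Restore the original missing values
    mvdecode y, mv(-888)
    mvdecode x, mv(-999)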
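
    As a possible cross-check (not part of the original reply), the same dummy can be rebuilt by brute force with built-in commands only and compared against dum. This is a minimal sketch: the name dum_check is hypothetical, and it assumes x and y hold integer values, as in the example data, so that the equality comparisons are exact.

    ```stata
    * Brute-force cross-check; dum_check is a hypothetical name, not from the thread.
    * Start at 0 wherever y is observed; leave it missing where y is missing.
    generate byte dum_check = 0 if !missing(y)

    quietly levelsof id, local(ids)
    foreach i of local ids {
        * Distinct nonmissing values of x for this id (levelsof skips missing by default)
        quietly levelsof x if id == `i', local(xs)
        foreach v of local xs {
            quietly replace dum_check = 1 if id == `i' & y == `v'
        }
    }

    * Stops with an error if the two versions disagree in any observation
    assert dum_check == dum
    ```

    The loop is slower than -rangestat- on large data sets, but it needs no sentinel codes and no extra installation, so it may be handy for verifying the result on a small sample.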

    • #3
      Dear Clyde, Thanks a lot for the helpful suggestion.
      Ho-Chuan (River) Huang
      Stata 19.0, MP(4)
