  • How to flag consecutive observations that take a particular value?

    Hi all, please consider the following example data

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3 studentid float acadyr str1 postcode float(school open flag_repeat flag_repeat_exp)
    "123" 2010 "A" 111 1 1 1
    "123" 2011 "A" 111 1 0 0
    "123" 2012 "B" 111 0 1 0
    "123" 2013 "C" 111 0 0 0
    "123" 2014 "A" 111 1 1 1
    "123" 2015 "D" 111 1 0 0
    "124" 2012 "S" 111 0 0 0
    "124" 2013 "C" 111 1 1 1
    "124" 2015 "C" 112 1 1 1
    "124" 2016 "C" 112 1 0 0
    "124" 2017 "S" 111 0 0 0
    "125" 2007 "A" 111 1 1 1
    "125" 2008 "A" 112 1 1 1
    "125" 2009 "A" 111 1 0 0
    "126" 2012 "S" 111 0 1 0
    "126" 2014 "B" 112 0 1 0
    "126" 2015 "C" 112 0 0 0
    "126" 2016 "D" 111 1 1 1
    "126" 2017 "A" 112 1 1 1
    "126" 2018 "A" 114 1 0 0
    end


    I want to flag consecutive observations that have open==1. Accordingly I have tried

    Code:
    sort studentid acadyr
    gen flag_repeat=0
    by studentid: replace flag_repeat=1 if open[_n]==open[_n+1]==1
    But it appears only
    Code:
    by studentid: replace flag_repeat=1 if open[_n]==open[_n+1]
    is being evaluated, as any two consecutive equal values of open are being flagged, not just open==1. I want my flag variable to look like flag_repeat_exp, where only consecutive open==1 observations are flagged, but I am not sure how to work that in.

    Appreciate your suggestions.

  • #2
    You are thinking mathematically that

    x = y = z

    is true or false as a whole. But Stata doesn't treat
    Code:
    open[_n]==open[_n+1]==1
    as a whole, but one operator at a time, as if you typed
    Code:
    (open[_n]==open[_n+1])==1
    which isn't usually what anyone wants. In your case you want to select
    Code:
    open == open[_n+1]
    as a necessary condition. When that comparison is true it evaluates to 1, which in turn is equal to 1, so the trailing == 1 takes you no further forward. But it's not a sufficient condition, as open could be 0 in both observations. See the code below for one solution.
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3 studentid float acadyr str1 postcode float(school open flag_repeat flag_repeat_exp)
    "123" 2010 "A" 111 1 1 1
    "123" 2011 "A" 111 1 0 0
    "123" 2012 "B" 111 0 1 0
    "123" 2013 "C" 111 0 0 0
    "123" 2014 "A" 111 1 1 1
    "123" 2015 "D" 111 1 0 0
    "124" 2012 "S" 111 0 0 0
    "124" 2013 "C" 111 1 1 1
    "124" 2015 "C" 112 1 1 1
    "124" 2016 "C" 112 1 0 0
    "124" 2017 "S" 111 0 0 0
    "125" 2007 "A" 111 1 1 1
    "125" 2008 "A" 112 1 1 1
    "125" 2009 "A" 111 1 0 0
    "126" 2012 "S" 111 0 1 0
    "126" 2014 "B" 112 0 1 0
    "126" 2015 "C" 112 0 0 0
    "126" 2016 "D" 111 1 1 1
    "126" 2017 "A" 112 1 1 1
    "126" 2018 "A" 114 1 0 0
    end
    
    bysort studentid (acadyr) : gen wanted = open == 1 & open[_n+1] == 1 
    
    list, sepby(studentid)
    
         +-----------------------------------------------------------------------------+
         | studen~d   acadyr   postcode   school   open   flag_r~t   flag_r~p   wanted |
         |-----------------------------------------------------------------------------|
      1. |      123     2010          A      111      1          1          1        1 |
      2. |      123     2011          A      111      1          0          0        0 |
      3. |      123     2012          B      111      0          1          0        0 |
      4. |      123     2013          C      111      0          0          0        0 |
      5. |      123     2014          A      111      1          1          1        1 |
      6. |      123     2015          D      111      1          0          0        0 |
         |-----------------------------------------------------------------------------|
      7. |      124     2012          S      111      0          0          0        0 |
      8. |      124     2013          C      111      1          1          1        1 |
      9. |      124     2015          C      112      1          1          1        1 |
     10. |      124     2016          C      112      1          0          0        0 |
     11. |      124     2017          S      111      0          0          0        0 |
         |-----------------------------------------------------------------------------|
     12. |      125     2007          A      111      1          1          1        1 |
     13. |      125     2008          A      112      1          1          1        1 |
     14. |      125     2009          A      111      1          0          0        0 |
         |-----------------------------------------------------------------------------|
     15. |      126     2012          S      111      0          1          0        0 |
     16. |      126     2014          B      112      0          1          0        0 |
     17. |      126     2015          C      112      0          0          0        0 |
     18. |      126     2016          D      111      1          1          1        1 |
     19. |      126     2017          A      112      1          1          1        1 |
     20. |      126     2018          A      114      1          0          0        0 |
         +-----------------------------------------------------------------------------+
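
    For anyone who wants to see the one-operator-at-a-time evaluation described above in isolation, here is a minimal sketch using display with literal values rather than the open variable. The literals are illustrative stand-ins chosen for this sketch and are not part of the original answer.

    ```stata
    * A chain of == is evaluated left to right, one operator at a time.

    * (0 == 0) == 1 becomes 1 == 1, so two zeros also pass the chained test
    display 0 == 0 == 1

    * (1 == 1) == 1 becomes 1 == 1, which is the case that was actually wanted
    display 1 == 1 == 1

    * (1 == 0) == 1 becomes 0 == 1, so unequal values fail
    display 1 == 0 == 1

    * Testing each value against 1 separately rejects the two-zeros case
    display 0 == 1 & 0 == 1

    * ... and still accepts the two-ones case
    display 1 == 1 & 1 == 1
    ```

    The first display shows 1, which is why runs of open == 0 were flagged as well. In the by-group solution above, open[_n+1] is missing for the last observation of each studentid, so open[_n+1] == 1 is false there and that observation gets 0.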

    • #3
      Thanks Nick! This was an interesting learning!
