How to find 2 consecutive values for a dataset with repeated observations

May Blake

Join Date: Feb 2020

Posts: 137
#1

How to find 2 consecutive values for a dataset with repeated observations

16 Feb 2020, 21:39

Hello,

I have the following data below with id, result and results-date. I would like to tag by id and date, observations that have at least 2 consecutive values of >=20 and have this be "1" in the new variable. If an observation does not have 2 consecutive values, I would like this tagged "0" . So for example, id 1 in my example dataset below has a 20 and immediately followed by a 20, so in the new variable, id 1 will be tagged a 1. I used the following code below but I am not getting this at all.

[CODE]

egen tag = tag(id resultdate)
egen ngood = total(tag * (result >= 20)) , by(id)

Code:

* Example generated by -dataex-. To install: ssc install dataex clear input byte(id result) int resultdate 1 20 20089 1 20 20211 1 35 20226 1 28 20227 1 33 20241 1 32 20281 2 12 20089 2 11 20211 2 13 20226 2 12 20241 2 12 20270 2 12 20298 3 23 20089 3 25 20211 3 28 20226 4 31 20089 4 12 20211 4 14 20226 4 27 20227 4 10 20241 5 20 20089 5 12 20211 5 13 20226 5 16 20241 end format %tdnn/dd/CCYY resultdate

Last edited by May Blake; 16 Feb 2020, 21:43.
Tags: None

Clyde Schechter

Join Date: Apr 2014
Posts: 30117

16 Feb 2020, 23:33

Code:

by id (resultdate), sort: gen wanted = min(result, result[_n-1]) >= 20 & _n > 1
by id (wanted), sort: replace wanted = wanted[_N]

Comment

Nick Cox

Join Date: Mar 2014
Posts: 35721

17 Feb 2020, 02:15

Clyde Schechter answered the question very nicely. For a broader context see tsspell from SSC and https://www.stata-journal.com/articl...article=dm0029

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(id result) int resultdate
1 20 20089
1 20 20211
1 35 20226
1 28 20227
1 33 20241
1 32 20281
2 12 20089
2 11 20211
2 13 20226
2 12 20241
2 12 20270
2 12 20298
3 23 20089
3 25 20211
3 28 20226
4 31 20089
4 12 20211
4 14 20226
4 27 20227
4 10 20241
5 20 20089
5 12 20211
5 13 20226
5 16 20241
end
format %tdnn/dd/CCYY resultdate

bysort id (resultdate) : gen pseudot = _n

tsset id pseudot
       panel variable:  id (unbalanced)
        time variable:  pseudot, 1 to 6
                delta:  1 unit

tsspell, cond(result >= 20 & result < .)

bysort id _spell (resultdate) : gen length = _N if _spell
(12 missing values generated)

sort id resultdate

list, sepby(id _spell)

     +-------------------------------------------------------------------+
     | id   result   resultd~e   pseudot   _seq   _spell   _end   length |
     |-------------------------------------------------------------------|
  1. |  1       20    1/1/2015         1      1        1      0        6 |
  2. |  1       20    5/3/2015         2      2        1      0        6 |
  3. |  1       35   5/18/2015         3      3        1      0        6 |
  4. |  1       28   5/19/2015         4      4        1      0        6 |
  5. |  1       33    6/2/2015         5      5        1      0        6 |
  6. |  1       32   7/12/2015         6      6        1      1        6 |
     |-------------------------------------------------------------------|
  7. |  2       12    1/1/2015         1      0        0      0        . |
  8. |  2       11    5/3/2015         2      0        0      0        . |
  9. |  2       13   5/18/2015         3      0        0      0        . |
 10. |  2       12    6/2/2015         4      0        0      0        . |
 11. |  2       12    7/1/2015         5      0        0      0        . |
 12. |  2       12   7/29/2015         6      0        0      0        . |
     |-------------------------------------------------------------------|
 13. |  3       23    1/1/2015         1      1        1      0        3 |
 14. |  3       25    5/3/2015         2      2        1      0        3 |
 15. |  3       28   5/18/2015         3      3        1      1        3 |
     |-------------------------------------------------------------------|
 16. |  4       31    1/1/2015         1      1        1      1        1 |
     |-------------------------------------------------------------------|
 17. |  4       12    5/3/2015         2      0        0      0        . |
 18. |  4       14   5/18/2015         3      0        0      0        . |
     |-------------------------------------------------------------------|
 19. |  4       27   5/19/2015         4      1        2      1        1 |
     |-------------------------------------------------------------------|
 20. |  4       10    6/2/2015         5      0        0      0        . |
     |-------------------------------------------------------------------|
 21. |  5       20    1/1/2015         1      1        1      1        1 |
     |-------------------------------------------------------------------|
 22. |  5       12    5/3/2015         2      0        0      0        . |
 23. |  5       13   5/18/2015         3      0        0      0        . |
 24. |  5       16    6/2/2015         4      0        0      0        . |
     +-------------------------------------------------------------------+

. 
.

Last edited by Nick Cox; 17 Feb 2020, 02:19.

Comment

May Blake

Join Date: Feb 2020

Posts: 137
#4

17 Feb 2020, 18:42

Thank you so much Nick and Clyde. Both of these solutions worked very well for my dataset. Thank you for taking the time to help me out.
Comment

Announcement

How to find 2 consecutive values for a dataset with repeated observations

Comment

Comment

Comment