appropriately deleting dummy 1s

Christian Vienna

Join Date: May 2017

Posts: 24
#1

14 Mar 2019, 11:34

Hi,

I have a bit of a puzzle for one specific application, but I feel a good solution could be useful for other applications too.

I have a dummy variable, D1, which sometimes has several consecutive 1s. D1 depends on the value of var1 and takes the value 1 whenever var1 is greater than 0.

I want to generate a second dummy variable, D2, which marks every run of 5 or more consecutive 1s in D1. D2 should take the value 1 only at the start of such a run, and 0 everywhere else.

Code:

gen D1 = . // gen dummy var
replace D1=1 if var1[_n]>0 // define that D1 is 1, whenever var1 is larger than 0
replace D1=0 if D1== .
replace D1=0 if var1==. // stata places 1s where var1=., so we need to correct

This first part of my code creates something like this:

Code:

D2   D1
 0    0
 0    0
 0    0
 0    0
 0    0
 1    1
 1    1
 1    1
 1    1
 1    1
 1    1
 1    1
 1    1
 1    1
 1    1
 1    1
 0    1
 0    1
 0    1
 0    1
 0    0
 0    0
 0    0
 0    0

Then, in the second part of my code, I want to delete all 1s in D2 that are not at the beginning of a succession of D1 1s.

Code:

gen D2 = 0
replace D2=1 if (D1[_n]+D1[_n+1] + D1ndic[_n+2] + D1[_n+3] + D1[_n+4])==5
replace D2=0 if D2[_n-1]==1

However, the code above creates the problem you can see below, where D2 does not just take the value 1 at the beginning of the succession.

Code:

D2   D1
 0    0
 0    0
 0    0
 0    0
 0    0
 1    1
 0    1
 1    1   // here D2 should have value 0
 0    1
 1    1   // here D2 should have value 0
 0    1
 1    1   // here D2 should have value 0
 0    1
 1    1   // here D2 should have value 0
 0    1
 1    1   // here D2 should have value 0
 0    1
 0    1
 0    1
 0    1
 0    0
 0    0
 0    0
 0    0

Any suggestions on how to solve this issue?

Many thanks!
Christian
Sarah Edgington

Join Date: Apr 2014

Posts: 284
#2

14 Mar 2019, 12:45

To start with, you have some naming inconsistencies between the code snippets you present and the data you're displaying. Your first chunk of code most definitely did NOT create the data displayed in your second code block, if only because it makes no reference to D2 at all.

You can simplify your creation of D1 from var1 with the following

Code:

mark D1 if var1>0 & var1<.
* or
gen D1=(var1>0 & var1<.)
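
To see why the `var1<.` condition is needed, recall that Stata treats numeric missing values as larger than any number, so `var1>0` on its own is true when var1 is missing. Here is a minimal sketch; the five values of var1 are made up for illustration and are not from the original data, and `D1_naive` is a hypothetical name.

```stata
clear
input float var1
-1
0
2
.
3
end

* naive version: a missing var1 counts as "greater than 0"
gen byte D1_naive = (var1 > 0)

* guarded version: a missing var1 gives D1 = 0
gen byte D1 = (var1 > 0 & var1 < .)

list, noobs
```

In the fourth observation, where var1 is missing, `D1_naive` is 1 while `D1` is 0.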

You also have a typo in your second snippet of code. D1ndic does not exist.

The core of your problem, though, is that you're changing the value of D2 based on the previous value of the same variable. Stata steps through the observations sequentially when it executes a command. So when Stata gets to observation 7 of your data, where D2 is initially 1, it changes D2 to 0 because the value of D2 in observation 6 is 1. When Stata then gets to observation 8, it looks at the value of D2 in observation 7. That value is now 0, not 1, so observation 8 remains unchanged. The easy way to fix this is to create a new variable to use for the comparison.

Try

Code:

gen dx=D2
replace D2=0 if dx[_n-1]==1
drop dx
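
Putting the pieces together, a self-contained sketch of the whole approach might look like this. The 13 values of D1 are made up for illustration (one run of seven 1s and one run of two), and the typo in the window sum is corrected to `D1[_n+2]`.

```stata
clear
input byte D1
0
0
1
1
1
1
1
1
1
0
1
1
0
end

* flag every observation where a window of 5 consecutive 1s begins;
* near the end of the data the sum is missing, so the comparison is false
gen byte D2 = (D1 + D1[_n+1] + D1[_n+2] + D1[_n+3] + D1[_n+4]) == 5

* compare against a frozen copy so earlier replacements do not affect later ones
gen byte dx = D2
replace D2 = 0 if dx[_n-1] == 1
drop dx

list, noobs
```

Before the last step D2 is 1 in observations 3, 4 and 5; afterwards only observation 3, the start of the long run, keeps its 1.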

Andrew Musau

Join Date: Oct 2014
Posts: 10291

14 Mar 2019, 13:00

You may want to create a grouping variable first and then you can do whatever. Here is one way.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float D1
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
1
1
1
1
0
1
1
1
1
1
end


*tag start-points
gen tag = D1==1 & D1[_n-1]==0

*tag end-points
replace tag=2 if  D1==1 & D1[_n+1]==0

gen tag2= tag
replace tag2 = tag2[_n-1] if tag2 ==0

*gen group var
gen group = D1==1 & D1[_n-1]==0
replace group= sum(group) if group!=0
replace tag2=0 if group!=0
replace group= group[_n-1] if tag2==1 & _n!=1
replace group= group[_n-1] if D1==1 & group[_n-1]!=0
drop tag*

*gen order var to restore order after manipulations
gen order=_n

Result:

Code:

. l, sepby(group)

     +--------------------+
     | D1   group   order |
     |--------------------|
  1. |  0       0       1 |
  2. |  0       0       2 |
  3. |  0       0       3 |
  4. |  0       0       4 |
  5. |  0       0       5 |
  6. |  0       0       6 |
     |--------------------|
  7. |  1       1       7 |
  8. |  1       1       8 |
  9. |  1       1       9 |
 10. |  1       1      10 |
 11. |  1       1      11 |
 12. |  1       1      12 |
 13. |  1       1      13 |
 14. |  1       1      14 |
 15. |  1       1      15 |
 16. |  1       1      16 |
 17. |  1       1      17 |
 18. |  1       1      18 |
 19. |  1       1      19 |
 20. |  1       1      20 |
 21. |  1       1      21 |
     |--------------------|
 22. |  0       0      22 |
 23. |  0       0      23 |
 24. |  0       0      24 |
 25. |  0       0      25 |
 26. |  0       0      26 |
     |--------------------|
 27. |  1       2      27 |
 28. |  1       2      28 |
 29. |  1       2      29 |
 30. |  1       2      30 |
     |--------------------|
 31. |  0       0      31 |
     |--------------------|
 32. |  1       3      32 |
 33. |  1       3      33 |
 34. |  1       3      34 |
 35. |  1       3      35 |
 36. |  1       3      36 |
     +--------------------+
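
Once each run of identical values has its own identifier, flagging the start of the long runs takes one line. The sketch below is self-contained and therefore builds the identifier in a more compact way than the code above: it uses a running sum of value changes, which is a swapped-in technique and not the method of the post. The data and the variable name `runid` are made up for illustration.

```stata
clear
input byte D1
0
1
1
1
1
1
1
0
1
1
1
0
1
1
1
1
1
end

* order variable to restore the original sort order
gen long order = _n

* hypothetical run identifier: increases by 1 each time D1 changes value
gen long runid = sum(D1 != D1[_n-1])

* within each run, flag the first observation if the run consists of 1s
* and contains at least 5 observations
bysort runid (order): gen byte D2 = (D1 == 1 & _n == 1 & _N >= 5)

sort order
list, sepby(runid) noobs
```

With the `group` and `order` variables from the post above, the analogous step would presumably be `bysort group (order): gen D2 = group>0 & _n==1 & _N>=5`, followed by `sort order`.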
