Dropping rows based on their position related to another row

Robert Eldritch

Join Date: Oct 2017

Posts: 12
#1


03 Mar 2019, 11:29

Hello!

I'm trying to drop rows based on their position relative to another row that is identified by a compound 'if' condition. For example, if I have the following data:

Code:

     | stringvar   flagvar |
     |---------------------|
  1. |      aaaa         . |
  2. |      bbbb         . |
  3. |      ****         . |
  4. |      cccc         . |
  5. |      dddd         . |
  6. |      ****         1 |
  7. |      eeee         . |
  8. |      ffff         . |

I'm trying to write code that deletes the three rows above the row with a value of 1 in flagvar (observation 6 here), but only if stringvar is "****" three rows above that flagged row. So something like

Code:

drop in _n-1 _n-2 _n-3 if flagvar==1 & stringvar[_n-3]=="****"

Obviously, that code doesn't work; it's just to illustrate what I'm trying to do. In other words, I want to drop the rows above a row flagged with a 1 in flagvar, working upward up to and including the nearest "****" in stringvar. If run properly, my data would then look like this:

Code:

     | stringvar   flagvar |
     |---------------------|
  1. |      aaaa         . |
  2. |      bbbb         . |
  3. |      ****         1 |
  4. |      eeee         . |
  5. |      ffff         . |

Is there a way to do this in Stata? I do not want to make reference to specific observation numbers at all in my code because I'm writing a program that could be applied to other similar datasets.

Thanks!
Clyde Schechter

Join Date: Apr 2014

Posts: 30065
#2

03 Mar 2019, 12:03

In the future, when showing data examples, please use the -dataex- command to do so, as I have in the code below. If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use -dataex-.

Having done the necessary "surgery" on your -list- output to make a data set out of it, the following code will do what you need:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 stringvar byte flagvar
"aaaa" .
"bbbb" .
"****" .
"cccc" .
"dddd" .
"****" 1
"eeee" .
"ffff" .
end

gen byte trigger = (flagvar == 1) & stringvar[_n-3] == "****"
drop if inlist(1, trigger[_n+1], trigger[_n+2], trigger[_n+3])

Additional unsolicited advice: for most purposes coding variables with 1 and missing value in Stata is a setup for problems and coding errors. Most Stata analyses will work best by coding dichotomies as 1 and 0. It doesn't happen to be a problem in this particular code, so I didn't change it, but if you have other uses of flagvar coming up, you might want to give it serious consideration.
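
To illustrate the caution about 1/missing coding, here is a minimal sketch on the example data from this thread. It relies on the fact that Stata treats a missing value as nonzero, and therefore as true, in an -if- condition. The recoding step at the end is only one possible way to convert flagvar to 0/1.

```stata
* Sketch: why a 1/missing flag is error-prone (example data from this thread)
clear
input str4 stringvar byte flagvar
"aaaa" .
"bbbb" .
"****" .
"cccc" .
"dddd" .
"****" 1
"eeee" .
"ffff" .
end

* A bare -if flagvar- treats missing as true, so all 8 observations are counted
count if flagvar

* An explicit comparison counts only the single flagged observation
count if flagvar == 1

* Recode missing to 0 so that -if flagvar- behaves as intended
replace flagvar = 0 if missing(flagvar)
count if flagvar
```

After the recode, the bare -if flagvar- and the explicit -if flagvar == 1- select the same observation.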

William Lisowski

Join Date: Dec 2014
Posts: 10150
#3

03 Mar 2019, 12:06

The following drops the observations you wanted dropped.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 stringvar byte flagvar
"aaaa" .
"bbbb" .
"****" .
"cccc" .
"dddd" .
"****" 1
"eeee" .
"ffff" .
end

generate flag2 = stringvar=="****" & flagvar[_n+3]==1
drop if inlist(1,flag2,flag2[_n-1],flag2[_n-2])
drop flag2
list, clean abbreviate(12)

Code:

. list, clean abbreviate(12)

       stringvar   flagvar  
  1.        aaaa         .  
  2.        bbbb         .  
  3.        ****         1  
  4.        eeee         .  
  5.        ffff         .
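
Both posted solutions hard-code a distance of three rows. If the "****" marker could sit at a varying distance above the flagged row, one possible variation is sketched below. It is an assumption about what is wanted, not part of the original answers: it drops every row above a flagged row, working upward, up to and including the nearest "****". The names obs and todrop are illustrative. Note that if there is no "****" above a flagged row, this sketch drops everything back to the first observation, and a flagged row that falls inside the dropped range is dropped too.

```stata
* Sketch: drop the rows above each flagged row, up to and including the nearest "****"
clear
input str4 stringvar byte flagvar
"aaaa" .
"bbbb" .
"****" .
"cccc" .
"dddd" .
"****" 1
"eeee" .
"ffff" .
end

* Remember the original order, then reverse it so that "above" becomes "after"
generate long obs = _n
gsort -obs

* -replace- works through the observations in order, so todrop[_n-1] already
* holds the updated value for the preceding observation in the reversed data.
* A row is dropped if the preceding row is flagged, or if the preceding row
* was dropped and was not itself the "****" marker that ends the run.
generate byte todrop = 0
replace todrop = (flagvar[_n-1] == 1) | (todrop[_n-1] == 1 & stringvar[_n-1] != "****")

drop if todrop

* Restore the original order and clean up
sort obs
drop obs todrop
list, clean abbreviate(12)
```

On the example data this leaves the same five observations as the listing above.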


Robert Eldritch

Join Date: Oct 2017

Posts: 12
#4

03 Mar 2019, 18:53

Thanks, Clyde! I will definitely use -dataex- from now on when I post questions on Statalist.

I didn't even realize that I could populate a variable with 1s and 0s just by doing gen newvar = flagvar==1. I thought it had to be gen newvar = 1 if flagvar==1, and then another line, replace newvar=0 if newvar==.

I obviously have a lot to learn...
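
For anyone else reading, the equivalence of the two forms can be checked directly. This is a minimal sketch on the example data; newvar1 and newvar2 are illustrative names. It relies on the fact that the expression flagvar == 1 evaluates to 0, not missing, when flagvar is missing.

```stata
* Sketch: a logical expression yields 1 (true) or 0 (false) directly
clear
input str4 stringvar byte flagvar
"aaaa" .
"bbbb" .
"****" .
"cccc" .
"dddd" .
"****" 1
"eeee" .
"ffff" .
end

* One-line form
generate byte newvar1 = flagvar == 1

* Two-step form
generate byte newvar2 = 1 if flagvar == 1
replace newvar2 = 0 if newvar2 == .

* -assert- exits with an error if the two variables differ in any observation
assert newvar1 == newvar2
```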
Nick Cox

Join Date: Mar 2014

Posts: 35642
#5

04 Mar 2019, 01:41

You may find https://www.stata.com/support/faqs/d...rue-and-false/ of interest.