  • Dropping rows based on their position related to another row

    Hello!

    I'm trying to drop rows based on their position related to another row specified in a compound 'if' statement. For example, if I have the following data:
    Code:
             | stringvar   flagvar |
             |---------------------|
          1. |      aaaa         . |
          2. |      bbbb         . |
          3. |      ****         . |
          4. |      cccc         . |
          5. |      dddd         . |
          6. |      ****         1 |
          7. |      eeee         . |
          8. |      ffff         . |
    I'm trying to write code that deletes the three rows above the row with a value of 1 in flagvar (observation 6 here), with the additional condition that stringvar must equal "****" three rows above the flagged row. So something like
    Code:
    drop in _n-1 _n-2 _n-3 if flagvar==1 & stringvar[_n-3]=="****"
    Obviously, that code doesn't work, but it's just to illustrate what I'm trying to do. In other words, I want to be able to drop the rows above a row flagged with a value of 1 in flagvar up to and including the next "****" in stringvar in descending order. If run properly, my data would then look like this:
    Code:
             | stringvar   flagvar |
             |---------------------|
          1. |      aaaa         . |
          2. |      bbbb         . |
          3. |      ****         1 |
          4. |      eeee         . |
          5. |      ffff         . |
    Is there a way to do this in Stata? I do not want to make reference to specific observation numbers at all in my code because I'm writing a program that could be applied to other similar datasets.

    Thanks!

  • #2
    In the future, when showing data examples, please use the -dataex- command to do so, as I have in the code below. If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.
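
For anyone who has not used it yet, a minimal sketch of a -dataex- call follows. It uses the auto dataset that ships with Stata as a stand-in; the variables and the 1/5 range are illustrative assumptions, not anything from the question.

```stata
* Stand-in data: the auto dataset shipped with Stata (assumption for illustration)
sysuse auto, clear

* Emit a copy-and-paste example of three variables, first five observations.
* On older installations, run "ssc install dataex" first.
dataex make price mpg in 1/5
```

The output is an -input- block like the ones in the answers in this thread, which can be pasted straight into a post.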

    Having done the necessary "surgery" on your -list- output to make a data set out of it, the following code will do what you need:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str4 stringvar byte flagvar
    "aaaa" .
    "bbbb" .
    "****" .
    "cccc" .
    "dddd" .
    "****" 1
    "eeee" .
    "ffff" .
    end
    
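    * trigger = 1 on a flagged row whose third row above holds "****"
    gen byte trigger = (flagvar == 1) & stringvar[_n-3] == "****"
    * drop the three rows immediately above any trigger row
    drop if inlist(1, trigger[_n+1], trigger[_n+2], trigger[_n+3])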
    Additional unsolicited advice: for most purposes coding variables with 1 and missing value in Stata is a setup for problems and coding errors. Most Stata analyses will work best by coding dichotomies as 1 and 0. It doesn't happen to be a problem in this particular code, so I didn't change it, but if you have other uses of flagvar coming up, you might want to give it serious consideration.
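
To make that advice concrete, here is a small sketch of how 1/missing coding can go wrong. The toy data and the name flag01 are assumptions made up for this illustration. The key fact is that Stata treats missing as nonzero, and as larger than any number.

```stata
* Toy data (assumption): one flagged row, the rest missing
clear
input byte flagvar
.
.
1
.
end

* Missing is nonzero, so a bare "if flagvar" is true for every row here
count if flagvar
assert r(N) == 4

* Missing is also larger than any number, so ">" comparisons catch it too
count if flagvar > 0
assert r(N) == 4

* Recoding to 1/0 with a logical expression avoids both surprises
generate byte flag01 = (flagvar == 1)
count if flag01
assert r(N) == 1
```

With flag01 in place, conditions such as "if flag01" and "if !flag01" select what they appear to select.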

    • #3
      The following drops the observations you wanted dropped.
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str4 stringvar byte flagvar
      "aaaa" .
      "bbbb" .
      "****" .
      "cccc" .
      "dddd" .
      "****" 1
      "eeee" .
      "ffff" .
      end
      
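      * flag2 = 1 on a "****" row that sits three rows above a flagged row
      generate flag2 = stringvar=="****" & flagvar[_n+3]==1
      * drop that row and the two rows after it
      drop if inlist(1,flag2,flag2[_n-1],flag2[_n-2])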
      drop flag2
      list, clean abbreviate(12)
      Code:
      . list, clean abbreviate(12)
      
             stringvar   flagvar  
        1.        aaaa         .  
        2.        bbbb         .  
        3.        ****         1  
        4.        eeee         .  
        5.        ffff         .
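
Both answers so far hard-code a distance of three rows. The original question also describes dropping everything above a flagged row "up to and including the next ****", which suggests the distance might vary. A possible sketch for that case follows; the helper names obs and dropping are made up here. It reverses the data and carries a "still dropping" state upward from each flagged row until a "****" row has been consumed. One caveat: if there is no "****" above a flagged row, this drops everything above it.

```stata
clear
input str4 stringvar byte flagvar
"aaaa" .
"bbbb" .
"****" .
"cccc" .
"dddd" .
"****" 1
"eeee" .
"ffff" .
end

* Remember the original order, then walk the data from bottom to top
generate long obs = _n
gsort -obs

* In reversed order, [_n-1] is the row just below in the original order.
* Start dropping right above a flagged row; keep dropping while the row
* below was being dropped and was not itself the closing "****" row.
* replace works through observations in order, so dropping[_n-1] is
* already up to date when the current row is evaluated.
generate byte dropping = 0
replace dropping = cond(flagvar[_n-1] == 1, 1, dropping[_n-1] & stringvar[_n-1] != "****") if _n > 1

* Restore the original order and drop the marked rows
sort obs
drop if dropping
drop obs dropping
list, clean
```

On the example data this removes the same three observations as the fixed-offset versions, and it makes no reference to specific observation numbers.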

      • #4
        Thanks, Clyde! I will definitely use -dataex- from now on when I post questions on Statalist.

        I didn't even realize that I could populate a variable with 1s and 0s just by doing gen newvar = flagvar==1. I thought it had to be gen newvar = 1 if flagvar==1, and then another line, replace newvar=0 if newvar==.
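
A quick sketch comparing the two forms on toy data (the data and variable names are assumptions for illustration). The last line shows a variant for when missing values of flagvar should stay missing rather than become 0.

```stata
* Toy data (assumption): a mix of 1, 0, and missing
clear
input byte flagvar
1
.
0
1
end

* One-step form: the logical expression evaluates to 1 (true) or 0 (false)
generate byte one_step = (flagvar == 1)

* Two-step form: set the 1s, then fill in the 0s
generate byte two_step = 1 if flagvar == 1
replace two_step = 0 if two_step == .

* Both variables agree on every row
assert one_step == two_step

* Variant: leave the result missing where flagvar is missing
generate byte keep_miss = (flagvar == 1) if !missing(flagvar)
list, clean
```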

        I obviously have a lot to learn...

        • #5
          You may find https://www.stata.com/support/faqs/d...rue-and-false/ of interest.
