
  • keep command

    Hello there
    I have the following data and would like to keep the children's records where:

    surgicalindex = 1 --> easy part

    command used:
    keep if surgicalindex == 1

    and (this is where my problem lies):

    all the episodes (episodeno) preceding the one with surgicalindex == 1

    As the episodeno may vary for each IDno, can you recommend any syntax?
    (The dataset has over 500 records and this is only a snapshot, so I cannot write out the episodeno values individually; each IDno has its own set of episode numbers.)

    I tried:

    keep if admidate <= surgicalindex == 1

    But clearly I'm not writing this adequately for Stata to understand me, as I am left with only 1 record.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(dvt procedureid IDno MI surgicalindex episodeno admidate)
    1   . 1 0 0 1   0
    1 110 1 0 1 2  31
    0   . 2 1 0 1  60
    1   . 2 1 1 2 305
    0 112 2 1 0 3 335
    end
    format %td admidate

    From the data above, I would like to keep the first four observations and drop the fifth (IDno 2, episodeno 3), because its surgicalindex != 1 and its admidate is later than the admidate of the surgicalindex == 1 record for child IDno 2.


    Can you recommend appropriate syntax/commands?
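A minimal sketch of why the attempt in #1 leaves a single record, assuming the five-observation snapshot above (only the relevant variables are re-entered, and the variable name attempted is made up for illustration). However Stata groups the two comparisons in admidate <= surgicalindex == 1, a date ends up being compared with a 0/1 value, so the condition can only be true where admidate is 0 or 1.

```stata
clear
input float(IDno surgicalindex episodeno admidate)
1 0 1   0
1 1 2  31
2 0 1  60
2 1 2 305
2 0 3 335
end

* Store the attempted condition in a variable instead of using it with -keep-,
* so it can be inspected row by row without losing any data.
* The expression compares a date with a 0/1 indicator (or with a 0/1 result),
* which is why it is true only when admidate itself is 0 or 1.
gen byte attempted = admidate <= surgicalindex == 1

list IDno episodeno surgicalindex admidate attempted, sepby(IDno) noobs

* Number of records the original -keep if- would have retained
count if attempted
```

Only the first observation (admidate 0, surgicalindex 0) satisfies the condition, which matches the single record reported in the post.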

  • #2
    Code:
    by IDno (episodeno), sort: keep if sum(surgicalindex[_n-1]) == 0



    • #3
      Try this. Note that you require either of two conditions, so the logical operator is OR ( | ) and not AND (&).

      Code:
      bys IDno: egen last_surgical=max(cond( surgicalindex==1,admidate,.))
      format last_surgical %td
      keep if surgicalindex==1 | admidate <=last_surgical



      • #4
        Originally posted by alejoforero
        Try this. Note that you require either of two conditions, so the logical operator is OR ( | ) and not AND (&).

        Code:
        bys IDno: egen last_surgical=max(cond( surgicalindex==1,admidate,.))
        format last_surgical %td
        keep if surgicalindex==1 | admidate <=last_surgical

        With regard to:

        bys IDno: egen last_surgical=max(cond( surgicalindex==1,admidate,.))

        Is this telling Stata to:
        1. Sort by IDno
        2. Generate a variable last_surgical equal to the maximum admidate among the records where surgicalindex == 1

        Is this correct?

        The reason I'm asking is that the Stata help for cond(x,a,b[,c]) says it returns a if x is true and nonmissing, b if x is false, and c if x is missing; a if c is not specified and x evaluates to missing.

        My reading of each argument is:
        x: surgicalindex == 1
        b (x is false): max admidate if surgicalindex != 1
        c (x is missing): if surgicalindex is missing, the result remains missing
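One way to check this reading is to split the one-line egen command into its two steps. The sketch below assumes the snapshot data from #1 (only the relevant variables are re-entered) and uses a helper variable, surg_date, that is not in the original code. Here cond() is called with three arguments: x is surgicalindex == 1, a is admidate and b is missing (.). A comparison such as surgicalindex == 1 always evaluates to 0 or 1 in Stata, never to missing, so the optional fourth argument c would not come into play even if surgicalindex were missing.

```stata
clear
input float(IDno surgicalindex episodeno admidate)
1 0 1   0
1 1 2  31
2 0 1  60
2 1 2 305
2 0 3 335
end
format %td admidate

* Step 1: evaluated row by row. Rows with surgicalindex == 1 get their own
* admidate; every other row gets missing (.).
gen surg_date = cond(surgicalindex == 1, admidate, .)
format surg_date %td

* Step 2: within each IDno, egen's max() ignores the missing values and
* returns the latest surgical admission date, copied to every row of that child.
bysort IDno: egen last_surgical = max(surg_date)
format last_surgical %td

list IDno episodeno surgicalindex admidate surg_date last_surgical, sepby(IDno) noobs
```

Because max() is used, a child with several surgicalindex == 1 records would get the date of the last one; replacing max() with min() would give the first instead.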




        • #5
          Originally posted by Clyde Schechter
          Code:
          by IDno (episodeno), sort: keep if sum(surgicalindex[_n-1]) == 0
          The problem with this code is that if a child has a procedure straight after another procedure, then the sum will be 2 and not zero, as in the data below, where child 2 has surgicalindex = 1 in two consecutive episodes. For this reason I think I'll be using the cond() approach, although despite reading Dr Cox's article in the Stata Journal I still don't understand the command above when I read it alongside the article and the Stata help.


          input float(dvt procedureid IDno MI surgicalindex episodeno admidate) 1 . 1 0 0 1 0 1 110 1 0 1 2 31 0 . 2 1 0 1 60 1 . 2 1 1 2 305 0 112 2 1 1 3 335



          • #6
            Your -dataex- got garbled in posting and is unusable. So I can't demonstrate how it works on that example. But I have made an example of my own. You are correct in noting that the sum will be 2 in the observation preceding the last procedure. But that doesn't matter. The code still correctly keeps only the observations leading up to the first date with a surgical index = 1, because those, and only those, are the observations for which the running sum is 0. That the sum later goes up to 2 instead of 1 makes no difference. Keeping everything up to the first surgical index date is what, in #1, I understood you to want. If that is not what you actually want, you need to restate your problem more clearly so that appropriate code can be written for it.

            Code:
            . * Example generated by -dataex-. For more info, type help dataex
            . clear
            
            . input float(dvt procedureid IDno MI surgicalindex episodeno admidate)
            
                       dvt  procedu~d       IDno         MI  surgica~x  episodeno   admidate
              1. 1   . 1 0 0 1     0
              2. 1 110 1 0 1 2    31
              3. 0   . 2 1 0 1    60
              4. 1   . 2 1 1 2   305
              5. 0 112 2 1 0 3   335
              6. .   . 3 . 0 1 22922
              7. .   . 3 . 1 2 22924
              8. .   . 3 . 1 3 22925
              9. end
            
            . format %td admidate
            
            .
            . list, noobs clean
            
                dvt   proced~d   IDno   MI   surgic~x   episod~o    admidate  
                  1          .      1    0          0          1   01jan1960  
                  1        110      1    0          1          2   01feb1960  
                  0          .      2    1          0          1   01mar1960  
                  1          .      2    1          1          2   01nov1960  
                  0        112      2    1          0          3   01dec1960  
                  .          .      3    .          0          1   04oct2022  
                  .          .      3    .          1          2   06oct2022  
                  .          .      3    .          1          3   07oct2022  
            
            . by IDno (episodeno), sort: keep if sum(surgicalindex[_n-1]) == 0
            (2 observations deleted)
            
            . list, noobs clean
            
                dvt   proced~d   IDno   MI   surgic~x   episod~o    admidate  
                  1          .      1    0          0          1   01jan1960  
                  1        110      1    0          1          2   01feb1960  
                  0          .      2    1          0          1   01mar1960  
                  1          .      2    1          1          2   01nov1960  
                  .          .      3    .          0          1   04oct2022  
                  .          .      3    .          1          2   06oct2022
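To see why only the observations with a running sum of 0 survive, it can help to store the running sum in a variable before keeping anything. This is a sketch on the same example data (only the relevant variables are re-entered); the variable name prior_surgeries is made up for illustration. Within each IDno, surgicalindex[_n-1] is missing for the first episode, and sum() treats missing as 0, so the count starts at 0 and only increases after a surgical index record has been passed.

```stata
clear
input float(IDno surgicalindex episodeno admidate)
1 0 1     0
1 1 2    31
2 0 1    60
2 1 2   305
2 0 3   335
3 0 1 22922
3 1 2 22924
3 1 3 22925
end
format %td admidate

* Running count, within each child, of surgical index records in EARLIER episodes
by IDno (episodeno), sort: gen prior_surgeries = sum(surgicalindex[_n-1])

list IDno episodeno surgicalindex admidate prior_surgeries, sepby(IDno) noobs

* Equivalent to the one-line -keep- above: retain everything up to and
* including the first surgical index record for each child
keep if prior_surgeries == 0
```

The two observations with prior_surgeries equal to 1 (episode 3 for IDno 2 and for IDno 3) are the ones dropped, matching the "(2 observations deleted)" message in the log.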
