  • Keep a group of observations if one record meets a certain condition

    Hi All,

    Here's an example of the sort of dataset that I'm working with:

    clear
    input studyid edvisits frequent_user
    1 0 1
    1 2 1
    1 1 1
    1 0 1
    1 0 1
    2 0 0
    2 0 0
    2 0 0
    2 0 0
    2 0 0
    3 1 1
    3 0 1
    3 2 1
    3 0 1
    3 0 1
    4 1 0
    4 0 0
    4 0 0
    4 0 0
    4 1 0
    end

    I am trying to subset out the data to create 3 separate datasets: people with no ed visits, people with few ed visits, and people with frequent ed visits.

    For example, the "few ed visits" dataset would contain all the records for the person with studyid #4, and the "frequent ed visits" dataset would contain
    all records for studyid #1 and #3.

    By looking around the forums I think the code should look something like this:
    *no edvisits*
    bysort studyid (edvisits) : drop if edvisits[1] > 0

    *few edvisits*
    bysort studyid (edvisits) : drop if edvisits[1] == 0
    bysort studyid (frequent_user) : drop if frequent_user == 1

    *frequent edvisits*
    bysort studyid (edvisits) : drop if edvisits[1] == 0
    bysort studyid (frequent_user) : keep if frequent_user == 1

    but I don't seem to be having any success so I'm clearly missing something.

    I think I've gotten close to the answer with these two posts, but like I said, no luck so far. Any help would be much appreciated: https://www.stata.com/statalist/archive/2005-08/msg00361.html
    https://www.statalist.org/forums/forum/general-stata-discussion/general/1396103-drop-whole-group-of-observations-if-one-fulfils-condition
    Last edited by Mike Reid; 28 Oct 2019, 11:21. Reason: formatting got messed up when I posted

  • #2
    Nothing to do with luck; everything to do with logic. (Indeed; that is likely to seem irritating...)

    I don't see the need for separate datasets. That will inhibit, indeed in some senses prohibit, comparison of these three groups. Why not just tag them 1 2 3?

    A rule for no visits is, I suggest, that the maximum number of visits is zero (the minimum number of visits being zero, as in your code, is necessary, but not sufficient). That is the first leg of three:

    Code:
    bysort studyid (edvisits) : gen wanted = edvisits[_N] == 0
    That assigns 1 to those with no visits ever and 0 to everyone else. We need to split the 0s. By some mysterious process you already know who your frequent users are.

    Code:
    replace wanted = 3 if frequent_user == 1
    What's left are the others:

    Code:
    replace wanted = 2 if wanted == 0  
    
    label def wanted 1 no 2 few 3 frequent 
    label val wanted wanted 
    tab wanted
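
If separate files really are needed downstream, the tag makes the split mechanical. The following is only a sketch under stated assumptions: it reuses the example data from the first post, rebuilds wanted exactly as above, and writes one file per group to the current working directory. The file names ed_group1, ed_group2 and ed_group3 are hypothetical, not anything from the thread.

```stata
clear
input studyid edvisits frequent_user
1 0 1
1 2 1
1 1 1
1 0 1
1 0 1
2 0 0
2 0 0
2 0 0
2 0 0
2 0 0
3 1 1
3 0 1
3 2 1
3 0 1
3 0 1
4 1 0
4 0 0
4 0 0
4 0 0
4 1 0
end

* Rebuild the tag: 1 = no visits, 2 = few, 3 = frequent.
* Within each studyid the data are sorted on edvisits, so edvisits[_N] is the maximum.
bysort studyid (edvisits) : gen wanted = edvisits[_N] == 0
replace wanted = 3 if frequent_user == 1
replace wanted = 2 if wanted == 0

* Write one file per group; preserve/restore keeps the full data in memory.
* ed_group1.dta, ed_group2.dta, ed_group3.dta are hypothetical file names.
forvalues g = 1/3 {
    preserve
    keep if wanted == `g'
    save ed_group`g', replace
    restore
}
```

Keeping the combined dataset with wanted as the master copy, and treating the three files as derived output, preserves the ability to compare groups directly.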



    • #3
      "Nothing to do with luck; everything to do with logic." I think I'll print that out and post it next to my computer. It will keep things in perspective. This worked perfectly. Thank you.
      Mike
