  • Drawing a random sample of subjects for manual review

    Hello,
    I am trying to draw a random subset of my data for manual review. I have a dataset with 17,000 encounters and I want to draw a random sample of 1,500. There are several factors I want to sample within:

    1 - Within the periods before and after implementation of the surgery protocol
    2 - Within the department (4 departments)
    3 - Within the surgeons in the health care facility
    4 - Within the protocol use variable (whether the surgeon adopted the protocol or not)
    5 - Within the patient encounters as some patients visited multiple times

    I am using the code below, but the result is all empty cells.

    Code:
    sample 8.8, by(time dept surgeon prot enc)

  • #2
    Hello May Blake. I wonder if your surgeon variable is causing trouble. Presumably, surgeons are not crossed with all other variables in your list. Rather, they are clustered within HC facilities. Does it work better if you omit surgeon?
    --
    Bruce Weaver
    Email: [email protected]
    Version: Stata/MP 19.5 (Windows)

    • #3
      Thank you, Bruce. Surgeons are clustered within departments. Even after I removed that variable, I still get empty cells, so I am not sure what the problem could be.

      • #4
        If you cross-tabulate all of those variables on the full dataset, do you see any empty cells?
        --
        Bruce Weaver
        Email: [email protected]
        Version: Stata/MP 19.5 (Windows)
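
        One way to run that check, as a minimal sketch. It assumes the variable names from the -sample- command in post #1; cell_n and cell_id are hypothetical names introduced here. As far as I understand it, -sample- with by() draws the requested percentage within each group, so if most cells hold only a few encounters, 8.8% of each may come out as zero observations. Because enc looks like an encounter counter, cells that include it are likely to be very small.

        ```stata
        * Count the observations in each cell defined by the sampling variables.
        bysort time dept surgeon prot enc: generate long cell_n = _N

        * Distribution of cell sizes across observations. Many cells of size 1
        * would suggest the cells are too small for a percentage draw.
        tabulate cell_n

        * Number the cells; the maximum of cell_id is the number of distinct
        * cells, for comparison with the target sample size.
        egen long cell_id = group(time dept surgeon prot enc)
        summarize cell_id
        ```

        If cell_n is mostly 1 or 2, dropping enc (and perhaps surgeon) from by() and rerunning the check would show whether the remaining cells are large enough to sample from.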

        • #5
          Hi Bruce,

          Here is a sample of my data below. Some fields are missing, but they are filled in with the string "MISSING" rather than left blank.

          ---------------------- copy starting from the next line -----------------------
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str4 time str5 dept str6 surgeon str7 prot byte enc
          "Pre"  "ONC"   "DANIEL" "YES"      3
          "Pre"  "SURG1" "LEDDY"  "YES"      1
          "Pre"  "SURG2" "EVA"    "NO"       2
          "Pre"  "ONC"   "LEDDY"  "MISSING"  1
          "Pre"  "SURG3" "MATT"   "MISSING"  4
          "Pre"  "ONC"   "MATT"   "NO"       4
          "Post" "SURG1" "LEDDY"  "NO"       6
          "Post" "SURG3" "MATT"   "YES"      7
          "Post" "ONC"   "DANIEL" "MISSING"  8
          "Post" "SURG2" "EVA"    "MISSING"  9
          "Post" "SURG2" "PRINCE" "NO"      10
          "Post" "ONC"   "SADDIE" "YES"     11
          "Post" "OONC"  "TRACE"  "MISSING" 12
          end
          ------------------ copy up to and including the previous line ------------------

          Listed 13 out of 13 observations

          • #6
            May, have you considered creating five separate flags, one for each listed condition, and then drawing a sample from each set of flagged observations?

            • #7
              Leonardo, I have not, but that does sound like what I am looking for. Unfortunately, I don't have a clue where to start. Would you recommend creating a subset of the data for each condition and then drawing from each subset?

              • #8
                That would be a reasonable approach to get started. I’m not really sure how to help you get started from the data provided.
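
                As a possible starting point, here is a minimal sketch of drawing within strata by hand instead of building separate subsets. It is built on assumptions, not on anything confirmed in this thread: the strata are taken to be time x dept x prot, with surgeon and enc left out because cells defined by them may be too small; the seed is arbitrary; and u, draw_order, stratum_n and in_sample are hypothetical names.

                ```stata
                * Assumed strata: time x dept x prot (surgeon and enc left out).
                set seed 12345                    // arbitrary, for reproducibility
                generate double u = runiform()    // random sort key

                * Put the observations in random order within each stratum.
                bysort time dept prot (u): generate long draw_order = _n
                by time dept prot: generate long stratum_n = _N

                * Flag about 8.8% of each stratum (1,500 of 17,000),
                * with at least one observation per stratum.
                generate byte in_sample = draw_order <= max(1, round(0.088 * stratum_n))

                * Check how many observations were flagged overall.
                tabulate in_sample
                ```

                Because of rounding and the one-per-stratum floor, the flagged total will probably not be exactly 1,500. Flagging rather than dropping keeps the full dataset in memory, so the strata can be redefined and the draw repeated.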
