  • Identifying events across different weeks

    Good day All,

    We want to identify participants who miss >=2 doses in each of at least two different weeks over a total of 4 weeks. In other words, a participant who misses >=2 doses in only one week is not flagged, but one who misses >=2 doses in both week 1 and week 2, or in weeks 1, 3 and 4, is flagged. The weeks with missed doses don't have to be consecutive. I have tried for hours on end but haven't been successful. Any assistance on how to tackle this task will be highly appreciated.

    I have added a sample of my data below.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int Date float(week total_doses_missed1 PID)
    24197 1 0 1
    24196 1 0 1
    24193 1 0 1
    24195 1 0 1
    24194 1 0 1
    24198 1 0 1
    24192 1 0 1
    24204 2 0 1
    24203 2 0 1
    24201 2 0 1
    24199 2 0 1
    24200 2 0 1
    24202 2 0 1
    24205 2 0 1
    24207 3 0 1
    24211 3 0 1
    24209 3 0 1
    24212 3 0 1
    24206 3 0 1
    24210 3 0 1
    24208 3 0 1
    24213 4 0 1
    24219 4 0 1
    24217 4 0 1
    24216 4 0 1
    24214 4 0 1
    24218 4 0 1
    24215 4 0 1
    24196 1 0 2
    24195 1 0 2
    24198 1 0 2
    24197 1 0 2
    24194 1 0 2
    24192 1 0 2
    24193 1 0 2
    24203 2 0 2
    24205 2 0 2
    24200 2 0 2
    24202 2 0 2
    24201 2 0 2
    24199 2 0 2
    24204 2 0 2
    24208 3 0 2
    24212 3 0 2
    24209 3 0 2
    24210 3 0 2
    24211 3 0 2
    24206 3 0 2
    24207 3 0 2
    24218 4 0 2
    24216 4 0 2
    24213 4 0 2
    24219 4 0 2
    24214 4 0 2
    24217 4 0 2
    24215 4 0 2
    24193 1 0 3
    24196 1 0 3
    24192 1 0 3
    24195 1 0 3
    24198 1 0 3
    24194 1 0 3
    24197 1 0 3
    24205 2 0 3
    24204 2 0 3
    24199 2 0 3
    24203 2 0 3
    24200 2 0 3
    24201 2 0 3
    24202 2 0 3
    24210 3 0 3
    24212 3 0 3
    24206 3 0 3
    24207 3 0 3
    24208 3 0 3
    24211 3 0 3
    24209 3 0 3
    24218 4 0 3
    24216 4 0 3
    24219 4 0 3
    24213 4 0 3
    24217 4 0 3
    24215 4 0 3
    24214 4 0 3
    24211 3 0 4
    24212 3 0 4
    24219 4 0 4
    24216 4 0 4
    24213 4 0 4
    24217 4 0 4
    24214 4 0 4
    24218 4 0 4
    24215 4 0 4
    24194 1 0 5
    24193 1 0 5
    24192 1 0 5
    24197 1 0 5
    24196 1 0 5
    24198 1 0 5
    24195 1 0 5
    24201 2 0 5
    24204 2 0 5
    24205 2 0 5
    24200 2 0 5
    24199 2 0 5
    24202 2 0 5
    24203 2 0 5
    24207 3 0 5
    24208 3 0 5
    24206 3 0 5
    24211 3 0 5
    24209 3 0 5
    24212 3 0 5
    24210 3 0 5
    24218 4 0 5
    24217 4 0 5
    24213 4 0 5
    24214 4 0 5
    24216 4 0 5
    24215 4 0 5
    24219 4 0 5
    24192 1 1 6
    24196 1 1 6
    24193 1 1 6
    24198 1 1 6
    24194 1 1 6
    24197 1 1 6
    24195 1 1 6
    24199 2 1 6
    24202 2 1 6
    24203 2 1 6
    24205 2 1 6
    24204 2 1 6
    24201 2 1 6
    24200 2 1 6
    24211 3 0 6
    24210 3 0 6
    24206 3 0 6
    24208 3 0 6
    24207 3 0 6
    24212 3 0 6
    24209 3 0 6
    24213 4 0 6
    24217 4 0 6
    24214 4 0 6
    24218 4 0 6
    24216 4 0 6
    24215 4 0 6
    24219 4 0 6
    24198 1 0 7
    24197 1 0 7
    24195 1 0 7
    24192 1 0 7
    24196 1 0 7
    24193 1 0 7
    24194 1 0 7
    24199 2 0 7
    24202 2 0 7
    24205 2 0 7
    24203 2 0 7
    24200 2 0 7
    24204 2 0 7
    24201 2 0 7
    24207 3 0 7
    24206 3 0 7
    24209 3 0 7
    24208 3 0 7
    24210 3 0 7
    24212 3 0 7
    24211 3 0 7
    24219 4 0 7
    24216 4 0 7
    24217 4 0 7
    24218 4 0 7
    24215 4 0 7
    24214 4 0 7
    24213 4 0 7
    24194 1 2 8
    24192 1 2 8
    24198 1 2 8
    24196 1 2 8
    24195 1 2 8
    24197 1 2 8
    24193 1 2 8
    24199 2 2 8
    24200 2 2 8
    24205 2 2 8
    24204 2 2 8
    24202 2 2 8
    24201 2 2 8
    24203 2 2 8
    24207 3 3 8
    24210 3 3 8
    24211 3 3 8
    24209 3 3 8
    24206 3 3 8
    24208 3 3 8
    24212 3 3 8
    24219 4 2 8
    24217 4 2 8
    end
    format %td Date
    Thanks,

    Galenda

  • #2
    Well, you've actually already done the hard parts yourself. You've defined the weeks and tallied the number of missed doses for each PID within each week. So all we have to do is count, for each PID, the number of different weeks in which the number of missed doses is >=2. If that count is 2 or more, we have a "target".

    Code:
    //  VERIFY TOTAL DOSES MISSED IS CONSISTENT FOR ALL OBSERVATIONS OF THE SAME PID
    //  IN THE SAME WEEK
    by PID week (total_doses_missed1), sort: assert total_doses_missed1[1] == total_doses_missed1[_N]
    
    frame put PID week total_doses_missed1, into(working)
    frame working {
        duplicates drop
        by PID (week), sort: egen n_affected_weeks = total(total_doses_missed1 >= 2)
        gen byte is_target = (n_affected_weeks >= 2)
    }
    
    frlink m:1 PID week, frame(working)
    assert `r(unmatched)' == 0
    frget is_target, from(working)
    The variable is_target created in the last line gives the information you want.

    Notes: If either -assert- command produces an error message, then something has gone wrong and you should not proceed further; instead, post back with example data that produces the problem. If the code runs without error messages but produces results that are not what you intend, then I have not correctly understood what you are looking to calculate. In that case, again, post back with an example where the results are incorrect and explain what you want the results to be and why.

    Added: -frame-s were introduced in Stata version 16. If you are running an older version of Stata than that, this code will not work for you. In that case, post back stating which version you are running, and I will craft alternative code that should work for you.
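
    For readers on a version without -frame-s, a frames-free variant is possible. What follows is a minimal sketch, not part of the original reply. It assumes, as the first -assert- above checks, that total_doses_missed1 is constant within each PID and week. The variable name one_per_week is made up for illustration, and the explicit exclusion of missing values is an addition that the code above does not make.

    ```stata
    * Mark exactly one observation for each distinct PID-week combination
    egen byte one_per_week = tag(PID week)

    * Count, per PID, the weeks with >= 2 missed doses; each week is counted
    * once because only the tagged observation can contribute.
    * Stata treats missing as larger than any number, so missing is excluded.
    by PID, sort: egen n_affected_weeks = ///
        total(one_per_week & (total_doses_missed1 >= 2) & !missing(total_doses_missed1))

    * Flag participants with two or more such weeks
    gen byte is_target = (n_affected_weeks >= 2)
    ```

    On the example data in #1, this should flag only PID 8, the one participant with >=2 missed doses in more than one week.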

    • #3
      Here's a take that may be complementary. In many ways using collapse to get the main result would be easier, but keeping results on different time scales in the same dataset can also be helpful.

      Code:
      * Example data: the -dataex- listing from #1 (identical, so not repeated here)
      
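      * Sum over the daily observations in each PID-week. In the example data the
      * weekly count is repeated on every daily row, so it is counted once per day.
      bysort PID week : egen n_missed = total(total_doses_missed1)
      
      * Tag one observation per PID-week
      egen tag = tag(PID week)
      
      * Build a string profile of the weekly values, week by week
      bysort tag PID (week) : gen profile = strofreal(n_missed) if tag & week == 1
      
      by tag PID : replace profile = profile[_n-1] + " " + strofreal(n_missed) if tag & week > 1
      
      * Copy the complete profile to every tagged observation of the same PID
      by tag PID : replace profile = profile[_N]
      
      list PID profile if tag & week == 1 , noobs sep(0)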
      Code:
        +------------------+
        | PID      profile |
        |------------------|
        |   1      0 0 0 0 |
        |   2      0 0 0 0 |
        |   3      0 0 0 0 |
        |   5      0 0 0 0 |
        |   6      7 7 0 0 |
        |   7      0 0 0 0 |
        |   8   14 14 21 4 |
        +------------------+

      • #4
        Thank you so much, Clyde and Nick, for your assistance. Clyde Schechter, the -assert- commands did not give any errors, and neither did the frames code, since I am using version 18.

        Since we are interested in which participants missed doses and not the specific dates on which they missed them, I took the route of collapsing the data first. Thanks, Nick Cox, for the tip. My final code is:


        Code:
        * collapsing the data to have one observation per week
        
        collapse (min) total_doses_missed1, by( PID week)
        
        * Creating variable that shows the number of weeks in which a participant missed >=2 doses
        by PID (week), sort: egen n_affected_weeks = total(total_doses_missed1 >= 2)
        
        * Creating a variable that shows 1 for participants who missed >= 2 doses in more than one week.
        gen byte target = (n_affected_weeks >= 2)
