Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Keeping observations within a daterange of a focal observation of interest

    Dear all,

    I'm trying to keep observations that satisfy a relatively simple condition, but am unable to do so.
    I want to keep all observations within 365 days of a non-missing "score" observation, and that for each ID (I was thinking about a bysort ID: keep if ... expression, but couldn't come up with anything that works).
    So, using the below example, for ID "045429", with score "46.561", I'd like to keep the four observations before that as well, since those are within 365 days of the focal observation of interest.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str6 ID float date_completed double score
    "045429" 19676                .
    "045429" 19683                .
    "045429" 19927                .
    "045429" 20104                .
    "045429" 20251                .
    "045429" 20292                .
    "045429" 20842 79.4864121235747
    "045429" 20867 79.4864121235747
    "045429" 21160                .
    "045429" 21517                .
    "045429" 21517                .
    "045429" 21523                .
    "045429" 21551                .
    "045429" 21551                .
    "045429" 21592                .
    "045429" 21718                .
    "045429" 21922                .
    "045429" 21922                .
    "045429" 21948                .
    "045429" 22027                .
    "045429" 22211 46.5613695595023
    "04542L" 15330                .
    "04543M" 17533                .
    "04543M" 18567                .
    "04543M" 19159                .
    "04543P" 16919                .
    "04543P" 18568                .
    "04543V" 14889                .
    "04543V" 14889                .
    "04543V" 14889                .
    "04543V" 15288                .
    "04545E" 19464                .
    end
    format %d date_completed
    Thanks for any help!


  • #2
    Code:
    format %d date_completed
    gen plus= date_completed+365 if !missing(score)
    gen minus= date_completed-365 if !missing(score)
    bys ID (date_completed): replace plus = plus[_n-1] if plus==.
    gsort ID -date_completed
    by ID: replace minus = minus[_n-1] if minus==.
    gen wanted = (date_completed<plus & plus!=.)|(date_completed>minus & minus!=.)
    drop plus minus
    keep if wanted
    
    
    . sort ID date
    
    . list,sepby(ID) noobs
    
      +-----------------------------------------+
      |     ID   date_co~d       score   wanted |
      |-----------------------------------------|
      | 045429   23jan2017   79.486412        1 |
      | 045429   17feb2017   79.486412        1 |
      | 045429   07dec2017           .        1 |
      | 045429   08jan2020           .        1 |
      | 045429   08jan2020           .        1 |
      | 045429   03feb2020           .        1 |
      | 045429   22apr2020           .        1 |
      | 045429   23oct2020    46.56137        1 |
      +-----------------------------------------+

    Comment


    • #3
      Thanks a lot!
      Works perfectly.

      Comment

      Working...
      X