  • Help with reshape after using rangejoin to select 5 matched unexposed per exposed

    I have a dataset of exposed individuals and wish to randomly select up to 5 unexposed individuals for each exposed individual (without replacement), matched on sex, year of diagnosis, and age (+/- 5 years). Some example code follows. I'd appreciate any comments on how best to do this, but I'm particularly interested in the best way to reshape the data after rangejoin so that I have one observation per individual. The result should contain a variable exposed indicating exposure status and a variable pair_id identifying the matched sets.

    My aim is to generate a matched cohort study, not a nested case-control study (i.e., risk-set sampling).

    I'm confident I can get from where I am to where I want to be, but I suspect there may be a better approach than the one I am taking.

    Code:
    use http://pauldickman.com/software/stata/exposed, clear
    
    // rangejoin is a community-contributed command from SSC; it requires rangestat
    // For each observation in exposed, select all unexposed
    // with same sex and year of diagnosis with age +/- 5 years
    rangejoin age -5 5 using http://pauldickman.com/software/stata/unexposed, by(sex yydx)
    
    // randomly select 5 unexposed if there are more than 5 matches
    set seed 8675309
    gen double shuffle = runiform()
    by id (shuffle), sort: keep if _n <= 5
    drop shuffle
    
    // reshape from wide format to long format
    rename age age1
    rename status status1
    rename dx dx1
    rename exit exit1
    
    rename age_U age2
    rename status_U status2
    rename dx_U dx2
    rename exit_U exit2
    
    reshape long age status dx exit, i(id id_U) j(exp)
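
    One possible alternative to the rename-and-reshape step is to stack the exposed and unexposed rows with append. The following is only a sketch under stated assumptions: it starts from the data in memory right after the "keep if _n <= 5" step (before any renames), it assumes rangejoin added the default _U suffix to id, age, status, dx and exit, and it assumes id values do not collide between the two files. The names pair_id, exposed and exposed_rows are introduced here for illustration. The duplicates report line is included because the sampling above is done separately within each exposed individual, so the same unexposed individual could be selected into more than one matched set.

    ```stata
    // Sketch only: run after the "keep if _n <= 5" step, before any renames.
    // Assumes variables id, sex, yydx, age, status, dx, exit plus the *_U copies.
    clonevar pair_id = id                   // matched-set identifier = exposed person's id

    // How often was each unexposed person selected? Copies > 1 mean reuse across sets.
    duplicates report id_U if !missing(id_U)

    preserve
        // one row per exposed person
        keep pair_id id sex yydx age status dx exit
        duplicates drop
        gen byte exposed = 1
        tempfile exposed_rows
        save `exposed_rows'
    restore

    // one row per matched unexposed person
    drop if missing(id_U)                   // exposed persons with no eligible match
    keep pair_id id_U sex yydx age_U status_U dx_U exit_U
    rename *_U *                            // id_U -> id, age_U -> age, and so on
    gen byte exposed = 0

    append using `exposed_rows'
    gsort pair_id -exposed id               // exposed person first within each matched set
    ```

    The result has one row per individual per matched set. If the duplicates report shows copies greater than 1, an unexposed individual appears in several sets, and the selection step would need to change if matching without replacement across sets is required.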

  • #2
    If anyone is interested, here's my inelegant solution. All comments and suggestions appreciated.

    http://pauldickman.com/software/stata/matching.do

    I'm currently teaching a course, and one of the participants asked me how to generate a matched cohort study. I couldn't find a good link in a quick Google search, so I wrote this sample code. Interestingly, I've worked in epidemiology for over 20 years but have never had to do exactly this.
