Help with reshape after using rangejoin to match 5 unexposed per exposed

Paul Dickman

Join Date: Apr 2014

Posts: 294
#1

13 Mar 2023, 03:41

I have a dataset of exposed individuals and wish to randomly select up to 5 unexposed individuals for each exposed individual (without replacement), matched on sex, year of diagnosis, and age (+/- 5 years). Example code follows. I'd appreciate any comments on how best to do this, but I'm particularly interested in the best way to reshape the data after rangejoin so that I have one observation per individual. The result should contain a variable exposed indicating exposure status and a variable pair_id identifying the matched sets.

My aim is to generate a matched cohort study, not a nested case-control study (i.e., risk-set sampling).

I'm confident I can get from where I am to where I want to be, but I suspect there may be a better approach than the one I am taking.

Code:

use http://pauldickman.com/software/stata/exposed, clear

// For each observation in exposed, select all unexposed
// with same sex and year of diagnosis with age +/- 5 years
rangejoin age -5 5 using http://pauldickman.com/software/stata/unexposed, by(sex yydx)

// randomly select 5 unexposed if there are more than 5 matches
set seed 8675309
gen double shuffle = runiform()
by id (shuffle), sort: keep if _n <= 5
drop shuffle

// reshape from wide format to long format
rename age age1
rename status status1
rename dx dx1
rename exit exit1
rename age_U age2
rename status_U status2
rename dx_U dx2
rename exit_U exit2
reshape long age status dx exit, i(id id_U) j(exp)
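One possible alternative to reshape, sketched here on toy data rather than on the files above: because the exposed individual's values are repeated on every row of the joined data, the unexposed columns can be split off, renamed, and appended beneath a single row per exposed individual. The toy values, the reduced set of variables, and the choice of pair_id equal to the exposed individual's id are all assumptions made for illustration.

```stata
* Toy data mimicking the wide layout after rangejoin and random selection.
* Exposed id 3 has no match, so its *_U variables are missing.
clear
input id sex yydx age id_U age_U
1 1 2000 50 101 48
1 1 2000 50 102 53
2 2 2001 60 103 61
3 1 2002 70 .   .
end

* Assumption: the matched set is labelled by the exposed individual's id
gen long pair_id = id

* Split off the unexposed members of each set and save them
preserve
drop if missing(id_U)          // exposed individuals with no match
keep pair_id sex yydx id_U age_U
rename (id_U age_U) (id age)
gen byte exposed = 0
tempfile unexposed
save `unexposed'
restore

* Reduce the remaining data to one row per exposed individual
keep pair_id id sex yydx age
duplicates drop
gen byte exposed = 1

* Stack exposed and unexposed: one observation per individual per set
append using `unexposed'
gsort pair_id -exposed id
list, sepby(pair_id)
```

In this layout an observation is identified by pair_id and id together. If the same unexposed individual is selected for more than one exposed individual, id alone will not be unique.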
Paul Dickman

#2

13 Mar 2023, 05:23

If anyone is interested, here's my inelegant solution. All comments and suggestions appreciated.

http://pauldickman.com/software/stata/matching.do

I'm currently teaching a course, and one of the participants asked me how to generate a matched cohort study. I couldn't find a good link in a quick Google search, so I wrote this sample code. Interestingly, I've worked in epidemiology for over 20 years but have never had to do exactly this.
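One detail worth checking with this kind of matching: selecting up to 5 candidates separately within each exposed individual does not by itself stop the same unexposed individual from being selected for several exposed individuals. A minimal sketch of one simple way to enforce selection without replacement is given below, using toy candidate pairs (hypothetical ids, not the data above). It first assigns each unexposed individual to a single randomly chosen exposed individual and only then keeps up to 5 per matched set.

```stata
* Toy candidate pairs mimicking rangejoin output
* (id = exposed individual, id_U = candidate unexposed individual)
clear
input id id_U
1 101
1 102
2 101
2 103
3 102
end

set seed 8675309
gen double shuffle = runiform()

* Exposed individuals with no candidates carry a missing id_U
drop if missing(id_U)

* Step 1: assign each unexposed to at most one exposed, chosen at random
bysort id_U (shuffle): keep if _n == 1

* Step 2: keep up to 5 unexposed per exposed
bysort id (shuffle): keep if _n <= 5
drop shuffle

* Each unexposed individual now appears in exactly one matched set
isid id_U
list, sepby(id)
```

This is a simple random assignment, not an optimal one: some exposed individuals may end up with fewer than 5 matches, or none, even where a different assignment would have given them more.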