simulating competing risks aanalysis

Andrew Xavier

Join Date: Jul 2018
Posts: 57

simulating competing risks aanalysis

19 Jul 2023, 06:07

Hi all

Please could you help me to figure out how to simulate competing risks survival data?

I am aware that I can achieve this using survsim command but I would like to generate this from first principles.

The idea is to simulate a HR for treatment for the event of interest and then have a HR for treatment for the competing risk. I want to understand how much the HR from the cox regression is biased compared to the HR from stcrreg. In the code, I tried to assume minimal censoring but this can be modified later.

I have tried to do this with the following code but it does not seem correct. Could you help me figure out the correct code?

From my understanding this is a key paper explaining the algorithm for the competing risks simulation which I am struggling to understand

Beyersmann, Jan, Aurelien Latouche, Anika Buchholz, and Martin Schumacher. "Simulating competing risks data in survival analysis." Statistics in medicine 28, no. 6 (2009): 956-971.

Code:

Code:

clear
set obs 3000

gen treatment = runiform()>0.5

gen double u = runiform()   

// 2 represents the competing risk and 1 represents the event of interest
gen risk_events = runiform()>0.95 // competing risk, 50/50 chance
recode risk_events 0=1 1=2

tab risk_events

// the hazard for the event of interest
local mediantime = 12
local b_cons = -ln(0.5)/`mediantime' 
display `b_cons'
gen xb1 = ln(`b_cons') + ln(0.70)*treatment
gen lambda1 = exp(xb1)

// the hazard for the competing risk
local mediantime = 7
local b_cons = -ln(0.5)/`mediantime'  
display `b_cons'
gen xb2 = ln(`b_cons') + ln(1)*treatment
gen lambda2 = exp(xb2)

gen t = -log(u)/lambda1 if risk_events==1
replace t = -log(u)/lambda2 if risk_events==2

// censoring time
gen cens=runiform(1000,2000)

// generate a non-informative censoring indicator
gen event = 1        
replace event = 0 if t>cens    

gen t0 = 0
replace t0 = t if event==1

gen state1 = 1
replace state1 = risk_events if event==1
replace state1 = 0 if event==0
tab event state1, col

gen t1 = t               
replace t1 = cens if event==0

stset t1, fail(state1=1)

sts graph, by(treatment)

stcox treatment
stcrreg treatment, compete(state1==2)

Thanks

A

Tags: None

Tiago Pereira

Join Date: Jan 2016

Posts: 403
#2

19 Jul 2023, 09:27

Check this article:

https://journals.sagepub.com/doi/pdf...6867X221083853
Comment
Andrew Xavier

Join Date: Jul 2018

Posts: 57
#3

19 Jul 2023, 10:02

Originally posted by Tiago Pereira View Post

Check this article:

https://journals.sagepub.com/doi/pdf...6867X221083853

Yes, I know this article, but uses survsim, I would like to know how to simulate the data without survsim
Comment
Tiago Pereira

Join Date: Jan 2016

Posts: 403
#4

19 Jul 2023, 10:31

Simulating competing risks models is a real maze, and even a tiny detail can mess up the whole setup. If you want to give it a shot from scratch, it is really good to learn the details, but be ready to put in a lot of time. Unfortunately, there's no easy "correct code" you can just plug in and play.

Last edited by Tiago Pereira; 19 Jul 2023, 10:34.
Comment
Andrew Xavier

Join Date: Jul 2018

Posts: 57
#5

19 Jul 2023, 10:38

Originally posted by Tiago Pereira View Post

Simulating competing risks models is a real maze, and even a tiny detail can mess up the whole setup. If you want to give it a shot from scratch, it is really good to learn the details, but be ready to put in a lot of time. Unfortunately, there's no easy "correct code" you can just plug in and play.

Could you help me with the code I provided?
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#6

24 Jul 2023, 11:48

I have a little bit of experience doing this in some simple cases. This might be good enough to help O.P. with this specific question. But, as Tiago Pereira points out, this is a pretty complex problem in general.

The main problem with the code shown in #1 is that it creates a data set with the wrong structure. That is, the data set has been created with separate observations, some of which are risk event 1 and some of which are risk event 2. But the structure needed for simulating a competing risks model is a data set in which each observation has a time-to-event for both risk events. Then the actual event (1 or 2) and time to event are selected based on whichever time to event is less.

So my code for what I believe O.P. is trying to simulate would look like this:

Code:

clear* local n_subjects 1000 set obs `n_subjects' set seed 271828 // OR WHATEVER gen byte treatment = runiform() > 0.5 local median_time_1 12 local median_time_2 7 local treatment_hazard_ratio_1 0.7 local treatment_hazard_ratio_2 1 forvalues i = 1/2 { gen log_hazard_`i' = log(log(2)/`median_time_`i'') /// + log(`treatment_hazard_ratio_`i'')*treatment gen t`i' = -log(runiform())/exp(log_hazard_`i') } gen t_observed = min(t1, t2) gen byte event = cond(t1 < t2, 1, 2) gen censoring_time = runiform(1000, 2000) replace event = 0 if t_observed > censoring_time stset t_observed, failure(event == 1) stcox treatment stcrreg treatment, compete(event == 2)

Notes:

1. Censoring times of 1,000 to 2,000 are extremely long compared to median event times of 12 and 7. In the sample generated above, no observations end up censored by this mechanism. Indeed, it would take a gargantuan sample for one to expect even a single censoring to occur with this mechanism.

2. This is a very simple case of competing risks. In fact, since the two data-generating mechanisms for the two events are entirely independent except for the occurrence of a common covariate, treatment, in the model, one would ordinarily not even use the Fine-Gray competing risks model to analyze this data. Unsurprisingly, the hazard ratio for treatment comes out pretty much the same with both -stcox- and -stcrreg-.

3. In O.P.'s code, he seems to want to constrain the ratio of occurrence of event1 and event2 to be 19:1. But that is inconsistent with the underlying median times of 12 and 7. Putting treatment effects aside, in an exponential model, when the hazard rates are mu1 and mu2, the probability of event i will be mu_i/(mu1+mu2), i = 1, 2. So the hazards here are .058 and 0.100 (to 3 decimal places), respectively. So we would expect 36.7% of the events to be event 1, and the rest event 2. (In this sample, exactly 34% of the events are event 1.) So O.P. will either have to pick different median times to event that are consistent with a 19:1 ratio, or stick with his median times to event but accept a much different event ratio.
2 likes
Comment

Announcement

simulating competing risks aanalysis

Comment

Comment

Comment

Comment

Comment