Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • simulating competing risks aanalysis

    Hi all

    Please could you help me to figure out how to simulate competing risks survival data?

    I am aware that I can achieve this using survsim command but I would like to generate this from first principles.

    The idea is to simulate a HR for treatment for the event of interest and then have a HR for treatment for the competing risk. I want to understand how much the HR from the cox regression is biased compared to the HR from stcrreg. In the code, I tried to assume minimal censoring but this can be modified later.

    I have tried to do this with the following code but it does not seem correct. Could you help me figure out the correct code?

    From my understanding this is a key paper explaining the algorithm for the competing risks simulation which I am struggling to understand

    Beyersmann, Jan, Aurelien Latouche, Anika Buchholz, and Martin Schumacher. "Simulating competing risks data in survival analysis." Statistics in medicine 28, no. 6 (2009): 956-971.

    Code:

    Code:
    clear
    set obs 3000
    
    gen treatment = runiform()>0.5
    
    gen double u = runiform()   
    
    // 2 represents the competing risk and 1 represents the event of interest
    gen risk_events = runiform()>0.95 // competing risk, 50/50 chance
    recode risk_events 0=1 1=2
    
    tab risk_events
    
    // the hazard for the event of interest
    local mediantime = 12
    local b_cons = -ln(0.5)/`mediantime' 
    display `b_cons'
    gen xb1 = ln(`b_cons') + ln(0.70)*treatment
    gen lambda1 = exp(xb1)
    
    // the hazard for the competing risk
    local mediantime = 7
    local b_cons = -ln(0.5)/`mediantime'  
    display `b_cons'
    gen xb2 = ln(`b_cons') + ln(1)*treatment
    gen lambda2 = exp(xb2)
    
    gen t = -log(u)/lambda1 if risk_events==1
    replace t = -log(u)/lambda2 if risk_events==2
    
    // censoring time
    gen cens=runiform(1000,2000)
    
    // generate a non-informative censoring indicator
    gen event = 1        
    replace event = 0 if t>cens    
    
    gen t0 = 0
    replace t0 = t if event==1
    
    gen state1 = 1
    replace state1 = risk_events if event==1
    replace state1 = 0 if event==0
    tab event state1, col
    
    gen t1 = t               
    replace t1 = cens if event==0
    
    stset t1, fail(state1=1)
    
    sts graph, by(treatment)
    
    stcox treatment
    stcrreg treatment, compete(state1==2)
    Thanks

    A

  • #2
    Check this article:

    https://journals.sagepub.com/doi/pdf...6867X221083853

    Comment


    • #3
      Originally posted by Tiago Pereira View Post
      Yes, I know this article, but uses survsim, I would like to know how to simulate the data without survsim

      Comment


      • #4
        Simulating competing risks models is a real maze, and even a tiny detail can mess up the whole setup. If you want to give it a shot from scratch, it is really good to learn the details, but be ready to put in a lot of time. Unfortunately, there's no easy "correct code" you can just plug in and play.
        Last edited by Tiago Pereira; 19 Jul 2023, 10:34.

        Comment


        • #5
          Originally posted by Tiago Pereira View Post
          Simulating competing risks models is a real maze, and even a tiny detail can mess up the whole setup. If you want to give it a shot from scratch, it is really good to learn the details, but be ready to put in a lot of time. Unfortunately, there's no easy "correct code" you can just plug in and play.
          Could you help me with the code I provided?

          Comment


          • #6
            I have a little bit of experience doing this in some simple cases. This might be good enough to help O.P. with this specific question. But, as Tiago Pereira points out, this is a pretty complex problem in general.

            The main problem with the code shown in #1 is that it creates a data set with the wrong structure. That is, the data set has been created with separate observations, some of which are risk event 1 and some of which are risk event 2. But the structure needed for simulating a competing risks model is a data set in which each observation has a time-to-event for both risk events. Then the actual event (1 or 2) and time to event are selected based on whichever time to event is less.

            So my code for what I believe O.P. is trying to simulate would look like this:
            Code:
            clear*
            
            local n_subjects 1000
            set obs `n_subjects'
            
            set seed 271828 // OR WHATEVER
            
            gen byte treatment = runiform() > 0.5
            
            local median_time_1 12
            local median_time_2 7
            
            local treatment_hazard_ratio_1 0.7
            local treatment_hazard_ratio_2 1
            
            forvalues i = 1/2 {
                gen log_hazard_`i' = log(log(2)/`median_time_`i'') ///
                    + log(`treatment_hazard_ratio_`i'')*treatment
                gen t`i' = -log(runiform())/exp(log_hazard_`i')
            }
            
            gen t_observed = min(t1, t2)
            gen byte event = cond(t1 < t2, 1, 2)
            
            gen censoring_time = runiform(1000, 2000)
            replace event = 0 if t_observed > censoring_time
            
            stset t_observed, failure(event == 1)
            
            stcox treatment
            stcrreg treatment, compete(event == 2)
            Notes:

            1. Censoring times of 1,000 to 2,000 are extremely long compared to median event times of 12 and 7. In the sample generated above, no observations end up censored by this mechanism. Indeed, it would take a gargantuan sample for one to expect even a single censoring to occur with this mechanism.

            2. This is a very simple case of competing risks. In fact, since the two data-generating mechanisms for the two events are entirely independent except for the occurrence of a common covariate, treatment, in the model, one would ordinarily not even use the Fine-Gray competing risks model to analyze this data. Unsurprisingly, the hazard ratio for treatment comes out pretty much the same with both -stcox- and -stcrreg-.

            3. In O.P.'s code, he seems to want to constrain the ratio of occurrence of event1 and event2 to be 19:1. But that is inconsistent with the underlying median times of 12 and 7. Putting treatment effects aside, in an exponential model, when the hazard rates are mu1 and mu2, the probability of event i will be mu_i/(mu1+mu2), i = 1, 2. So the hazards here are .058 and 0.100 (to 3 decimal places), respectively. So we would expect 36.7% of the events to be event 1, and the rest event 2. (In this sample, exactly 34% of the events are event 1.) So O.P. will either have to pick different median times to event that are consistent with a 19:1 ratio, or stick with his median times to event but accept a much different event ratio.

            Comment

            Working...
            X