  • Advice regarding estimating competing risk

    If I want to calculate the cumulative incidence of both the outcome (fracture = 1 / no fracture = 0) and death (dead = 1 / alive = 0) in a cohort, using competing-risks methods, does the "status" variable I use have to combine outcome, death, and censoring (outcome = 1, dead = 2, no outcome/no death = 3)? I have been trying with the commands below, but they won't seem to work:
    Code:
    gen outcome = had_fracture
    gen endpoint = min(first_fracture_date, end_follow_date)
    stset endpoint, failure(outcome = 1) scale(365.25) origin(start_follow_date)
    stcompet= , compet(dead=1)
    It is the line "stcompet= , compet(dead=1)" that makes me suspect I need to combine outcome and dead into one variable for this to be correct.

    And then to depict the (competing) cumulated incidence of fractures and death:
    Code:
    twoway ///
    (line cummort _t if status==1 , c(J)) ///
    (line cummort _t if dead==1 , c(J))
    I cannot use dataex because I am working with sensitive data, sorry. Please tell me if more information is needed for better comprehension. Thank you.
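    For what it's worth, a minimal sketch of the combined-status setup the post suspects is needed. This is untested and the names status and cuminc are assumptions; the stcompet call follows the usual stcompet (SSC) syntax, where compet1() names the code for the competing event:
    Code:
    * Sketch only: combine the two endpoints into one status variable,
    * then tell stcompet which code marks the competing event (death).
    gen status = 3                                   // 3 = censored: no fracture, alive
    replace status = 1 if outcome == 1               // 1 = fracture (event of interest)
    replace status = 2 if dead == 1 & outcome != 1   // 2 = death before any fracture
    
    stset endpoint, failure(status == 1) scale(365.25) origin(start_follow_date)
    stcompet cuminc = ci, compet1(2)                 // cumulative incidence, death competing
    
    * Plot the cumulative incidence of each event type:
    twoway (line cuminc _t if status == 1, c(J) sort) ///
           (line cuminc _t if status == 2, c(J) sort)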

  • Clyde Schechter
    replied
    Well, my bias is to almost always do more analyses and present them all.

    Unfortunately, journals face costs associated with increasing the length of articles, so they impose limits on length. This inevitably leads authors to leave out important information, because the length limits are usually unreasonably severe and leave inadequate room for good scientific presentation. The good news is that most journals now allow authors to write a supplement to their paper, which they will then host on their website. So you can present what you think are the most important analyses in the paper itself, and then refer the reader to additional analyses in the supplement. For my part, I take advantage of this: in my recent publications my supplements are typically substantially longer than the articles themselves, but provide a much fuller picture of what was done and what the results imply.



  • Jonas Kristensen
    replied
    Thank you for the advice!
    In the literature, researchers generally do not describe how they handle (censor or exclude) patients who acquire a second stroke in the follow-up period, even though they only investigate patients with first-ever stroke, so it is hard to make a decision based on prior research.

    As mentioned, I am leaning towards simply censoring patients at a second stroke, in the analyses of stroke severity and civil status as well as in the incidence rate of fractures. The information on stroke severity and civil status is collected when the patient is admitted with the stroke.

    My last question in this regard is: do you think it would make sense to do a separate analysis for stroke severity where patients with a second stroke are censored, but not to do this for the other analyses? It just seems like overkill, when other articles hardly describe what they do with patients with multiple strokes in terms of exclusion and censoring.



  • Clyde Schechter
    replied
    I think your reasoning makes sense. Following a second stroke, the severity of the first stroke is probably no longer very predictive about the risk of a fracture--the severity of the later stroke becomes more salient. So censoring at the second stroke makes sense to me.

    For civil status, as long as the civil status remains unchanged, it seems reasonable to leave them uncensored at second stroke. You would, however, want to censor them at the time of any change in civil status.

    Another approach to this is to use multiple records per patient and use time-varying covariates in your Cox model. This, too, however, entails the assumption that the effect of stroke severity on fracture risk after a second stroke is the same as the effect of the same level of stroke severity after the first stroke. I don't have enough intuition about this to say if that assumption is credible or not. I can think of arguments why the same level of stroke severity might have a different effect after the second stroke, but I can also think of arguments why it might be the same. If you decide to explore this approach, you might want to discuss that with some experts in physical medicine & rehabilitation or geriatrics.
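    A hedged sketch of what that multiple-record, time-varying-covariate setup might look like; the variable names (episode_end_date, current_stroke_severity) are assumptions, not from this thread:
    Code:
    * Hypothetical sketch: one record per patient per stroke episode, with the
    * severity of the most recent stroke carried as a time-varying covariate.
    * With multiple records per subject, -stset- needs the id() option.
    stset episode_end_date, id(person_id) failure(outcome == 1) ///
        origin(start_follow_date) scale(365.25)
    stcox i.current_stroke_severity i.civil_status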



  • Jonas Kristensen
    replied
    Thank you, Mr. Schechter.
    I have had some discussion regarding when, and if, it makes sense to censor patients when they have a second stroke.
    I am estimating the incidence rate of fractures after stroke, and then I am using Cox regression to analyse the risk factors stroke severity and civil status.

    The initial reasoning is that a patient's risk of fracture would change drastically on having a second stroke, which would have an unwanted effect on the estimates. But some of the observation time would also be lost if these patients are censored (approximately 40,000 out of 440,000 person-years).
    It is also an option to censor patients after a second stroke only in the Cox regression analysis of stroke severity, as a patient's stroke severity most likely isn't the same after the second stroke, and then not to censor after a second stroke in the analysis of civil status and the estimation of the incidence rate.

    I am personally leaning towards censoring at the second stroke, period, sacrificing the 40,000 person-years of risk time but then definitively having an article analysing patients from first-ever stroke to fracture.



  • Clyde Schechter
    replied
    I do agree. You should be seeing fewer failures and less person time at risk when you terminate observation at the second stroke. However, the impact on the incidence rate cannot be predicted: logically, it could increase, decrease, or stay the same.



  • Jonas Kristensen
    replied
    I seem to have figured out what the issue was. When using the code as written in post #33, the "second_stroke_date" variable was generated as the first stroke date for each patient, so every patient got a date in "second_stroke_date". But as I have written before, I am merging several data sets, so I tried generating the "second_stroke_date" variable before merging, which resulted in "second_stroke_date" being missing for everyone who had not had a second stroke, and containing the second stroke date for those who had one, which seems correct.

    I then get only 900 observations flagged as "end on or before enter" by the stset command, where I got 115,145 before.

    The results of calculating the incidence rate of fractures have then given rise to some new questions, because the risk time has gone down while the number of failures and the IR have gone up.

    Old results (follow-up not censored at second stroke): person-time = 441,635; failures = 16,268; incidence rate = 36.84
    New results (risk time runs only until second stroke): person-time = 409,892; failures = 16,555; incidence rate = 40.39

    It seems logical to me that the risk time would go down, because some of the patients who previously had time until fracture now have time until second stroke, which came before their initial fracture. But the fact that the failures and the IR increase does not seem logical to me. Do you agree?
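    As a quick sanity check on the reported figures, the rates are just failures divided by person-time (per 1,000 person-years), so the two quantities move together mechanically:
    Code:
    display 16268/441635*1000    // = 36.84, matching the old incidence rate
    display 16555/409892*1000    // = 40.39, matching the new incidence rate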



  • Clyde Schechter
    replied
    Well, the code looks correct. And you have verified that the second stroke dates are correct. But I agree the results are not sensible. In particular, what has changed from what you were doing before is that you now consider people censored as of the date of their second stroke (if they have a second one). Other than that, everything is as it was before. Now, this should result in more censored observations and fewer failures than before. But the additional endpoints derive from second strokes, and by definition, they cannot occur before the first stroke (which is the point at which people enter the analysis).

    Without seeing the data, I don't know how to troubleshoot this. The best I can suggest is
    Code:
    browse if _st != 1 | _t < _t0
    which will show you the observations that are being excluded and perhaps by looking at those you will be able to see where things are going wrong.



  • Jonas Kristensen
    replied
    Thank you!
    I have tried setting up the commands like so:
    Code:
    sort person_id stroke_date
    by person_id (stroke_date): gen second_stroke_date = stroke_date[2]
    by person_id (stroke_date): gen start_follow_date = stroke_date[1]
    by person_id (admission_date), sort: egen first_post_stroke_fx_date = ///
    min(cond(has_fracture_now & admission_date > start_follow_date, admission_date, .))
    by person_id: egen diagnosiscode = min(cond(admissiondate == first_post_stroke_fx_date, diag, .))
    by person_id (admission_date): gen end_follow_date = min(td(31dec2017), dødsdato, second_stroke_date)
    
    gen dead = 1
    replace dead = 0 if missing(death_date)
    by person_id: egen died = max(dead)
    by person_id: keep if _n == 1
    
    gen outcome = !missing(first_post_stroke_fx_date)
    gen endpoint = min(first_post_stroke_fx_date, end_follow_date)
    stset endpoint, failure(outcome = 1) scale(365.25) origin(start_follow_date)
    But it seems like something goes wrong, as 115,145 observations end on or before enter (see attachment).
    I have checked that the second stroke date is generated correctly, which is the case.

    Attached Files



  • Clyde Schechter
    replied
    Yes. You first need to compute the date of the second stroke:

    Code:
    by person_id (stroke_date): gen second_stroke_date = stroke_date[2]
    The variable second_stroke_date will contain a missing value if there is only one stroke.

    Then change your end_follow_date variable code to:
    Code:
    by person_id (admission_date): gen end_follow_date = min(td(31dec2017), dødsdato, second_stroke_date)



  • Jonas Kristensen
    replied
    Great! Thank you

    In regards to the former code, where I have only included patients with first-ever stroke:
    Code:
    sort person_id stroke_date
    by person_id (stroke_date): gen start_follow_date = stroke_date[1]
    by person_id (admission_date), sort: egen first_post_stroke_fx_date = ///
    min(cond(has_fracture_now & admission_date > start_follow_date, admission_date, .))
    by person_id: egen diagnosiscode = min(cond(admissiondate == first_post_stroke_fx_date, diag, .))
    by person_id (admission_date): gen end_follow_date = min(td(31dec2017), dødsdato)
    by person_id: egen had_fracture = max(has_fracture_now)
    gen dead = 1
    replace dead = 0 if missing(death_date)
    by person_id: egen died = max(dead)
    by person_id: keep if _n == 1
    Some of these patients/person_ids have more than one stroke in the follow-up period. Is it possible to set up the commands so that patients contribute time at risk until their second stroke, if they do not have a fracture before that time?

    So patients are followed until fracture or second stroke or death or end of follow up. Hope that makes sense
    Last edited by Jonas Kristensen; 12 Feb 2019, 12:21.



  • Clyde Schechter
    replied
    Yes. Change the -stset- specifications and then re-run -stdescribe-. -stset- includes an -if(exp)- option [not to be confused with adding -if whatever- to an analytic command itself] that will allow you to look at subpopulations. And changing the -failure()- option will allow you to look at specific diagnoses.
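    A hedged sketch of the two restrictions described above; female, fx_hip, and the diagnosis code 810 are made-up names for illustration, not from this thread:
    Code:
    * Restrict the risk set to a subpopulation via -stset-'s if() option:
    stset endpoint, failure(outcome == 1) scale(365.25) ///
        origin(start_follow_date) if(female == 1)
    stdescribe
    
    * Restrict the failure definition to one specific diagnosis by building
    * an indicator first, then using it in failure():
    gen fx_hip = outcome == 1 & diagnosiscode == 810   // 810 is a made-up code
    stset endpoint, failure(fx_hip == 1) scale(365.25) origin(start_follow_date)
    stdescribe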



  • Jonas Kristensen
    replied
    Hello again
    I was hoping that you could help me with estimating patients' mean time at risk until fracture for specific groups.

    I know that i can use:
    Code:
    stdescribe
    to get the mean time at risk until fracture with a 95% CI, but is there a smart way to estimate the mean time at risk for only women, say, or only for a specific fracture diagnosis?



  • Jonas Kristensen
    replied
    Thank you again Mr. Schechter!



  • Clyde Schechter
    replied
    This is a difficult problem to which there is no simple solution. Nor is it feasible to summarize all the situations and approaches in a short post. At https://pdfs.semanticscholar.org/58d...c218e126e4.pdf , Paul Allison gives an overview of some of the commonly used approaches and their pros and cons. Reading that would be a good starting point for how to think about this and what some of your options might be.

