
  • Cox-snell residuals multiple-record-per-subject. Cox survival/event history analysis

    Hi,

    I have a data set with multiple records per subject (a.k.a. "multiple id"): each subject (a subnational state) has more than one observation, time is measured in years, so the unit of analysis is the state-year, and my covariates are time dependent. After asking Stata to predict the Cox-Snell residuals, I stset again on those residuals (to declare them as survival data) so that I can plot them against the cumulative hazard function. Does the option "id(varname)" have to be specified in that second stset?

    All examples that I have found on the web deal with single-record-per-subject data. In this case, the instructions are clear:
    . stset year, failure(adoption==1) id(state)
    . stcox varlist, cluster(state)
    . predict cs, csnell partial
    . stset cs, failure(adoption==1)
    . sts generate km = s
    . generate H = -ln(km)
    . line H cs cs, sort ytitle("") clstyle(. refline)

    In my case, do I have to add:
    . partial
    . id(state)

    To the 2nd & 4th lines from above (in red), respectively?

    Someone faced a similar situation in the past, but no solution was found back then. (see link)

    Thanks a lot in advance for your support!
    Victor Cruz

  • #2
    Please use CODE delimiters (FAQ Section 12). Your commands are so small that I can't read them.
    Steve Samuels
    Statistical Consulting

    Stata 14.2



    • #3
      Thanks for the kind remark.

      A bit more info on my data set: I am interested in time until first event, indicated by the failure/censor variable "adoption"; the event occurs when this variable is coded as "1". Most of my covariates are time-dependent. I am using Stata 13.

      I am interested in determining whether the model fits the data, for which plotting the Cox-Snell residuals against the cumulative hazard is one (amongst several) tests.

      In the first code block, I give the instructions to do this for data sets without multiple records per subject. All examples that I have found online follow the same steps. Unfortunately, I have not found how these instructions (especially lines 3 and 4, in red) would change for data sets with multiple records per subject.


      In the second code, I ask whether my proposed changes are adequate for my data set.

      Code:
      *stset on year, indicating that my censor variable is "adoption" and that the event occurs when it equals 1
      . stset year, failure(adoption==1)
      *estimate a Cox regression on several covariates
      . stcox var1 var2...
      *Request Stata to estimate the Cox-Snell residuals
      . predict cs, csnell
      *stset on the generated residuals, with the same censor variable
      . stset cs, failure(adoption==1)
      *Instructions to plot the cumulative hazard against the residuals.
      . sts generate km = s
      . generate H = -ln(km)
      . line H cs cs, sort ytitle("") clstyle(. refline)


      Code:
      /*
      If I add the multiple-record-per-subject option "id()" on syntax line 1:
      a) I have to add the "partial" option in syntax line 3, in order to get the residuals for each record within the subject*
      b) do I also have to add the "id()" option when I stset again on the generated Cox-Snell Residuals? (syntax line 4)
      */
      . stset year, failure(adoption==1) id(state)
      . stcox var1 var2, cluster(state)
      . predict cs, csnell partial
      . stset cs, failure(adoption==1) id(state)
      . sts generate km = s
      . generate H = -ln(km)
      . line H cs cs, sort ytitle("") clstyle(. refline)
      * From stcox postestimation manual p15
      "Adding the partial option will produce partial Cox–Snell residuals, one for each record within subject; see partial below. Partial Cox–Snell residuals are the additive contributions to a subject’s overall Cox Snell residual. In single-record data, the partial Cox–Snell residuals are the Cox–Snell residuals."

      *same manual, p4
      "partial is relevant only for multiple-record data and is valid with mgale, csnell, deviance, ldisplace, lmax, scores, esr, and dfbeta. Specifying partial will produce “partial” versions of these statistics, where one value is calculated for each record instead of one for each subject. The subjects are determined by the id() option to stset."
      Victor Cruz
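
The manual passages quoted above say that partial Cox-Snell residuals are additive contributions to each subject's overall residual. A minimal sketch of what that implies is given below. It does not use the state-year data from this thread: `stan3` (the multiple-record Stanford heart transplant data shipped with Stata) stands in for it, with `id` playing the role of `state` and `died` the role of `adoption`. The names `cs_part`, `cs_subj`, `cs_sum`, and `last` are made up for illustration. The sketch assumes, following the stcox postestimation documentation, that `csnell` without `partial` stores one residual per subject on that subject's last record and leaves the other records missing. It is a way to check the behaviour on your own data, not an authoritative answer.

```stata
* Sketch only: stan3 stands in for the state-year data (assumption).
webuse stan3, clear
stset t1, failure(died) id(id)

* Fit the Cox model; the choice of standard errors does not affect the residuals.
stcox age posttran surgery year

* One value per record: the additive contributions.
predict double cs_part, csnell partial

* One value per subject, expected on each subject's last record only.
predict double cs_subj, csnell

* Sum the partial residuals within subject and compare on the last record.
bysort id: egen double cs_sum = total(cs_part)
bysort id (t1): generate byte last = (_n == _N)
compare cs_subj cs_sum if last

* Re-declare the data using the subject-level residual as the "time" variable.
* Records where cs_subj is missing drop out, leaving one record per subject,
* which is why no id() is given here (assumption to verify on your own data).
stset cs_subj, failure(died)

* Nelson-Aalen cumulative hazard of the residuals, plotted against the
* 45-degree reference line.
sts generate H = na
line H cs_subj cs_subj, sort ytitle("") clstyle(. refline)
```

If `compare` reports that the two variables agree on the last records, the subject-level residual is just the sum of the partial ones, and the goodness-of-fit plot can be built from the subject-level version with a single record per subject.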
