Snapspan

There is a problem with using snapspan in your situation, but it goes a little beyond what you've described. According to the Manual entry for stset, "Snapshot data can be converted to survival-time data if we are willing to assume that x1 and x2 remained constant between times...".

Your data violate that assumption. Why does that rule out snapspan? To use stcox properly with time-varying covariates, one needs, at every failure time, the covariate values at that time for everyone at risk. With covariates that can change after the measurement dates, you won't have such values.

In detail: let \(i\) index individuals, \(i = 1, \dots, n\). For individual \(i\), let \(x_i(t)\) denote the value of covariate \(x\) at time \(t\). This will be known only if \(t\) is one of the times at which measurements were made for that individual.

Let \(t^*_j\) denote the ordered failure times in the data set. For every individual \(i\) at risk at \(t^*_j\), you will need a prediction of \(x_i(t^*_j)\). Add these predicted points to the data as if they were "real" measurements. Then use snapspan.

How you get this prediction will depend on what you know about the behavior of \(x(t)\). A full approach would require modeling and multiple imputation, but for continuous measurements, interpolation and extrapolation (ipolate) might be sufficient.
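To make the "predict at each failure time, then append" step concrete, here is a minimal sketch in Python rather than Stata. It assumes simple linear interpolation, with extrapolation from the nearest two measurements, and it assumes each individual enters at their first measurement. The function names, the data layout, and the toy numbers are all hypothetical, not part of any Stata command.

```python
from bisect import bisect_left

def interpolate(times, values, t):
    """Linear interpolation at t; extrapolates from the nearest two points."""
    if len(times) == 1:
        return values[0]
    # Pick the segment [times[k-1], times[k]] closest to t.
    k = min(max(bisect_left(times, t), 1), len(times) - 1)
    t0, t1 = times[k - 1], times[k]
    x0, x1 = values[k - 1], values[k]
    return x0 + (x1 - x0) * (t - t0) / (t1 - t0)

def add_failure_time_rows(measurements, exit_time, failure_times):
    """Append a predicted x_i(t*_j) for every individual at risk at t*_j.

    measurements: {id: [(t, x), ...]} sorted by t (hypothetical layout)
    exit_time:    {id: time of failure or censoring}
    """
    augmented = {}
    for pid, rows in measurements.items():
        times = [t for t, _ in rows]
        values = [x for _, x in rows]
        known = dict(rows)
        for t_star in failure_times:
            # Assumption: at risk from the first measurement until exit.
            at_risk = times[0] < t_star <= exit_time[pid]
            if at_risk and t_star not in known:
                known[t_star] = interpolate(times, values, t_star)
        augmented[pid] = sorted(known.items())
    return augmented

# Toy data: individual 2 fails at t=6, individual 1 at t=12.
measurements = {1: [(0, 10.0), (10, 20.0)], 2: [(0, 5.0), (4, 9.0)]}
exit_time = {1: 12, 2: 6}
failure_times = [6, 12]

augmented = add_failure_time_rows(measurements, exit_time, failure_times)
for pid, rows in augmented.items():
    print(pid, rows)
```

The augmented rows would then play the role of the "real" measurements. Whether a linear rule is defensible depends entirely on how \(x(t)\) behaves between visits.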

There are some studies for which the most recently measured covariate values might be acceptable. This would be the case if you wanted early prediction of events from the measured covariates. You might then define the question as: would \(x(t)\) predict risk of an event within a certain interval after \(t\)?
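For that "most recently measured value" case, the covariate at a failure time is simply the last measurement at or before it. A minimal Python sketch follows. The function name and toy numbers are hypothetical, and returning None before the first measurement is just one possible convention.

```python
from bisect import bisect_right

def most_recent_value(times, values, t):
    """Value of the last measurement taken at or before time t.

    times must be sorted ascending; returns None if nothing was measured yet.
    """
    k = bisect_right(times, t)
    return values[k - 1] if k > 0 else None

# Toy measurement history for one individual.
times = [0, 4, 10]
values = [5.0, 9.0, 7.0]

x_at_6 = most_recent_value(times, values, 6)    # carried forward from t=4
x_at_10 = most_recent_value(times, values, 10)  # measured exactly at t=10
print(x_at_6, x_at_10)
```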

For completeness, I note that there are other studies in which the use of most time-dependent covariates is likely to be inadvisable. In a randomized clinical trial, for example, many indicators measured post-randomization might be on the causal pathway; controlling for such indicators would be over-control.
