Snapspan

CEdward

Join Date: Nov 2014

Posts: 131
#1

06 Jan 2015, 14:47

Hi all,

I want to confirm whether my data require the snapspan command. Suppose I have IDs with multiple records, and each record carries the date on which certain measurements were taken. There is also a failure event (e.g. cardiovascular disease). Would I need snapspan if those measurements are not constant between the dates on which they were taken?

Thanks.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#2

08 Jan 2015, 19:10

There is a problem with using snapspan in your situation, but it goes a little beyond what you've described. According to the Manual entry for stset, "Snapshot data can be converted to survival-time data if we are willing to assume that x1 and x2 remained constant between times ...".

Your data violate that assumption. What's the reason that snapspan shouldn't be used in your situation? To use stcox properly with time-varying covariates, one needs, at every failure time, the covariate values at that time. With covariates that can change after the measurement dates, you won't have such values. In detail: let \(i\) index individuals, \(i = 1, \dots, n\). For individual \(i\), let \(x_i(t)\) denote the value of covariate \(x\) at time \(t\). This will be known only if \(t\) is one of the times at which measurements were made for that individual.

Let \(t^*_j\) denote the ordered failure times in the data set. For every individual \(i\) at risk at \(t^*_j\), you will need a prediction of \(x_i(t^*_j)\). Add these predicted points to the data as if they were "real" measurements. Then use snapspan.
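To see why a value is needed for everyone at risk, it may help to write out the Cox partial likelihood. This is a sketch for a single covariate with coefficient \(\beta\), assuming no tied failure times; \(R(t^*_j)\) (the set of individuals at risk at \(t^*_j\)) and \(i_j\) (the individual who fails at \(t^*_j\)) are notation introduced here, not part of the original post:

\[
L(\beta) = \prod_{j} \frac{\exp\{\beta\, x_{i_j}(t^*_j)\}}{\sum_{i \in R(t^*_j)} \exp\{\beta\, x_i(t^*_j)\}}
\]

Each denominator involves \(x_i(t^*_j)\) for every individual still at risk, whether or not that individual happened to be measured at \(t^*_j\).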

How you get this prediction will depend on what you know about the behavior of \(x(t)\). A full approach would require modeling and multiple imputation, but for continuous measurements, interpolation and extrapolation (ipolate) might be sufficient.
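The interpolation step can be sketched in a language-neutral way. The Python below is only an illustration of the logic, not Stata code and not a substitute for ipolate: the data layout, the function names, and the rule that an individual is at risk from the first measurement until a supplied exit time are all assumptions made for this example. It uses linear interpolation only; extrapolation beyond the measured range is deliberately left out.

```python
from bisect import bisect_left

def interpolate_at(times, values, t):
    """Linearly interpolate x(t); return None outside the measured range."""
    if not times or t < times[0] or t > times[-1]:
        return None
    k = bisect_left(times, t)
    if times[k] == t:
        return values[k]
    t0, t1 = times[k - 1], times[k]
    x0, x1 = values[k - 1], values[k]
    return x0 + (x1 - x0) * (t - t0) / (t1 - t0)

def add_predictions(measurements, exit_time, failure_times):
    """Add a predicted row at each failure time for individuals at risk.

    measurements:  {id: [(time, x), ...]} sorted by time (hypothetical layout)
    exit_time:     {id: last time under observation}     (hypothetical layout)
    Returns {id: [(time, x, is_predicted), ...]} sorted by time.
    """
    out = {}
    for pid, rows in measurements.items():
        times = [t for t, _ in rows]
        values = [x for _, x in rows]
        augmented = [(t, x, False) for t, x in rows]
        for ft in failure_times:
            # Assumption: at risk from first measurement through exit time.
            at_risk = times[0] <= ft <= exit_time[pid]
            if at_risk and ft not in times:
                x_hat = interpolate_at(times, values, ft)
                if x_hat is not None:
                    augmented.append((ft, x_hat, True))
        out[pid] = sorted(augmented)
    return out

# Toy data: two individuals, failures observed at times 4 and 10.
measurements = {1: [(0, 10.0), (10, 20.0)], 2: [(0, 5.0), (8, 9.0)]}
exit_time = {1: 10, 2: 8}
failure_times = [4, 10]
augmented = add_predictions(measurements, exit_time, failure_times)
```

In this toy example each individual gains one predicted row at time 4; individual 2 has left observation before time 10, so no row is added there. The third element of each tuple flags predicted rows so they can be told apart from real measurements later.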

There are some studies for which the most recently measured covariate values might be acceptable. This would be the case if you wanted to get early prediction of events from the measured covariates. You might then define the question as: would \(x(t)\) predict the risk of an event within a certain interval after \(t\)?
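As a rough illustration of that alternative, the Python sketch below carries the most recent measurement forward and, for each measurement time, flags whether the event occurred within a fixed horizon afterwards. The function names, the data layout, and the half-open window \((t, t + h]\) are assumptions chosen for the example, not something prescribed in this thread.

```python
from bisect import bisect_right

def last_value_at(times, values, t):
    """Most recent measurement at or before t; None if nothing was measured yet."""
    k = bisect_right(times, t)
    return values[k - 1] if k > 0 else None

def event_within(times, values, event_time, horizon):
    """For each measurement (t, x), flag whether the event fell in (t, t + horizon].

    event_time is None for individuals who never had the event.
    """
    return [
        (t, x, event_time is not None and t < event_time <= t + horizon)
        for t, x in zip(times, values)
    ]

# Toy data for one individual: measured at 0, 6 and 12; event at time 14.
times = [0, 6, 12]
values = [1.0, 2.0, 3.0]
carried = last_value_at(times, values, 7)
flags = event_within(times, values, event_time=14, horizon=5)
```

Here the value in force at time 7 is the one measured at time 6, and only the measurement at time 12 is followed by the event within 5 time units.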

For completeness, I note that there are other studies in which the use of most time-dependent covariates is likely to be inadvisable. In a randomized clinical trial, for example, many indicators measured post-randomization might be on the causal pathway; controlling for such indicators would be over-control.

Last edited by Steve Samuels; 08 Jan 2015, 19:23.

Steve Samuels
Statistical Consulting

Stata 14.2