Survival Model with "treatments" at some point in time

Doro Kuebler

Join Date: Feb 2016

Posts: 11
#1


23 Feb 2016, 03:42

Dear all,

I was glad to see that Stata 14 offers panel survival models; however, I am unsure whether they are exactly what I need. I know this is not Stata-specific, but I would be very grateful if you could advise me on which framework to look for. I want to model the following:

I observe people over a certain time period, one observation per month. I want to find the determinants of how long I observe them before they are "dead" (not really dead). In particular, I want to examine the effect of some event e happening to them on the probability of their death. This event e happens on different dates for different people and does not happen to everyone. If it happens, it happens only once. Death is also final: once dead, always dead.

In ordinary survival models with one observation per person I cannot properly model the time dynamics, correct? That is why I thought panel survival models would be interesting, but StataCorp says that they are really only for multiple deaths per person.

I should add that not only the treatment indicator e (0/1) but also my other covariates change over time, so I really feel I need to bring the time structure into the analysis.

Can anyone give me some guidance on this issue?
Thank you,

Doro
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

23 Feb 2016, 06:31

Hello Doro,

Welcome to the Stata Forum.

You didn't give much information about the format of your data, the distribution of the variables and, particularly, the "time" variable, apart from the observations being monthly. Is time discrete? How many observations are there per person? Does everybody start at the same time?

That said, I wonder if a recent thread wouldn't apply to your case: http://www.statalist.org/forums/foru...vival-analysis

Hopefully that helps.

Best,

Marcos

Doro Kuebler

Join Date: Feb 2016

Posts: 11
#3

23 Feb 2016, 08:18

Hello Marcos, thanks for getting back so quickly.

Time is discrete in my dataset.
The people are observed monthly for a few months without information on the possible treatment, and then for up to 24 monthly observations once the treatment variable is available.

In this sense, the link you provided, and especially the link to http://www.stata.com/manuals13/stdiscrete.pdf included there, points in the right direction, yes.

So it boils down to some kind of panel logit model.

1) However, I wonder whether omitting a lag of the dependent variable for as long as the spell lasts would result in omitted variable bias.
2) I am not sure how to deal with the left censoring of the treatment variable: should I just start the sample at the time when all variables become available, as I had planned? Unfortunately, my problem here also resembles the thread you mentioned, where no further help or update was given on how to proceed.
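For readers following the two questions above, a hedged sketch of the discrete-time hazard setup they refer to, assuming no frailty term. The symbols are illustrative and not taken from the thread: $u_i$ is the last month before person $i$ enters the usable sample (0 if observed from the start), $t_i$ is the last observed month, $e_{it}$ is the 0/1 treatment indicator, and $y_{it}$ equals 1 only in the month in which the spell ends.

```latex
\begin{align}
h_{it} &= \Pr(T_i = t \mid T_i \ge t,\; e_{it},\; x_{it}), \\
\operatorname{logit}(h_{it}) &= \alpha_t + \beta\, e_{it} + x_{it}'\gamma, \\
L_i &= \prod_{t = u_i + 1}^{t_i} h_{it}^{\,y_{it}} \,(1 - h_{it})^{\,1 - y_{it}} .
\end{align}
```

Under these assumptions, $L_i$ has the form of a binary logit likelihood on person-month rows. Every row still at risk has $y_{i,t-1} = 0$ by construction, so a lagged dependent variable would be constant within a spell, and duration dependence is carried by the baseline terms $\alpha_t$ instead. Starting the product at $u_i + 1$ conditions on survival up to entry, which is how late entry is usually handled when there is no frailty; with frailty the conditioning is more involved.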

Last edited by Doro Kuebler; 23 Feb 2016, 08:36.
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#4

23 Feb 2016, 11:21

Hello Doro,

Thanks for reading the thread shared in #2. Indeed, it has much to do with your query, considering the paucity of details.

That said, to cope with the lag "within" treatment groups, I mean the "pre-post" split, whatever the "treatment" is, I believe a newly created dummy variable might well do the trick.
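A minimal sketch of such a pre/post dummy on person-month data, written in Python purely for illustration. The function name `add_post_treatment_dummy`, the column names `id`, `month` and `post_e`, and the choice to switch the dummy on in the month of the event itself (rather than the month after) are all assumptions, not something stated in this thread.

```python
def add_post_treatment_dummy(rows, treatment_month):
    """Return copies of the person-month rows with a 0/1 column 'post_e'.

    rows: list of dicts with keys 'id' and 'month' (integer month index).
    treatment_month: dict mapping id -> month of event e; ids that never
        experience e are simply absent from the dict.
    """
    out = []
    for row in rows:
        t_e = treatment_month.get(row["id"])
        # Assumption: the dummy is 1 from the event month onward, 0 before,
        # and 0 throughout for people who never experience e.
        post = int(t_e is not None and row["month"] >= t_e)
        out.append({**row, "post_e": post})
    return out


# Hypothetical data: person 1 is treated in month 3, person 2 never.
rows = [{"id": 1, "month": m} for m in range(1, 6)]
rows += [{"id": 2, "month": m} for m in range(1, 4)]
treated = add_post_treatment_dummy(rows, {1: 3})
```

The resulting `post_e` column is time-varying within a person, so it can enter a person-month model alongside the other time-varying covariates.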

Maybe you'll get further help in this Forum if, in line with the recommendations in the FAQ, you provide more information on your study design, including sample size, missing data, patterns of the variables, number of "zero" events, censoring, etc., even if it is just taken from a basic "start-up" model.

Best,

Marcos

Doro Kuebler

Join Date: Feb 2016

Posts: 11
#5

24 Feb 2016, 06:36

Okay, I will try to give more details. I had not done so until now because I did not want to complicate my post further and because the data are confidential.

People either report to an institution or they don't. A person-institution combination will be called a relationship. There are around 60,000 relationships between 17,000 people and 2,000 institutions. Reports are monthly, and the sample period is 2010-2014. An important variable becomes available only after 2012: I see whether people participate in a special program.

Relationships (i.e. reporting) can start and stop in any month, for instance before 2010, 2010-2012, 2013-... (this is the bad part, I guess, because of late entry plus the data availability in 2010-2012). Once a relationship has stopped, it never resumes. An important feature, however, is that a person may keep reporting to institution A while ceasing to report to institution B, which reduces that person's number of relationships. Increases are less common but also possible.

I now want to model the person-specific factors that determine whether a relationship with an institution exists (i.e. reporting or not reporting in a given month) AS WELL AS the remaining time a relationship lasts from some point in time t. Given the links you provided, Marcos, I specifically wonder whether I need a frailty model. In the literature I work with, fixed effects are widespread and are thought to address unobserved heterogeneity adequately. I understand that fixed effects may lead to multiplicative biases when inferring hazards from the panel logit model described in stdiscrete.pdf. However,

q1) What counts as a fixed effect here? Obviously, including dummy variables for every person would. But what about a simple male/female indicator: would that already be a "problematic fixed effect"?

I might then choose EITHER to use only a subsample of relationships that start at a given point, so as to avoid late entry (I don't see myself writing my own ml routine as proposed in the manual), and use it in a frailty model, OR to use the full late-entry sample and estimate a non-frailty model. I guess the choice depends on the answer to q1): if I can control for cross-sectional variation well enough with some indicator variables without running into the fixed-effects problem, that would be great!

Specifically, I want to see whether time-invariant characteristics of people make a difference to their probability of stopping reporting after the treatment. Sort of an interaction. I guess this raises the next problem, since interactions in binary choice models are problematic as well.

Last edited by Doro Kuebler; 24 Feb 2016, 06:40.
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#6

24 Feb 2016, 06:51

Hello Doro,

This model is far beyond the ones I'm used to, since it seems you have recurrent events. Maybe you'll get further advice from the Forum members. Personally, as a starting point, and considering that time is discrete, I'd think about - xtlogit - and - xtcloglog - models.
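To make the difference between the two suggested link functions concrete, here is a small Python sketch (not Stata code, and not the internals of either command). It maps a linear predictor `xb` to a monthly hazard under each link and accumulates survival over months; the function names and example values are hypothetical. The complementary log-log link is commonly described as the discrete-time counterpart of a continuous-time proportional hazards model, while the logit link corresponds to proportional odds; for small hazards the two are numerically close.

```python
import math


def hazard_logit(xb):
    """Monthly hazard implied by a logit link: h = 1 / (1 + exp(-xb))."""
    return 1.0 / (1.0 + math.exp(-xb))


def hazard_cloglog(xb):
    """Monthly hazard implied by a cloglog link: h = 1 - exp(-exp(xb))."""
    return 1.0 - math.exp(-math.exp(xb))


def survival(hazards):
    """Probability of still being 'alive' after each month, given the
    sequence of monthly hazards: S_t = product over s <= t of (1 - h_s)."""
    s = 1.0
    curve = []
    for h in hazards:
        s *= 1.0 - h
        curve.append(s)
    return curve


# Hypothetical linear predictor of -5 in every month of a 24-month window.
h_logit = hazard_logit(-5.0)
h_cloglog = hazard_cloglog(-5.0)
curve = survival([h_cloglog] * 24)
```

With rare monthly exits the choice of link tends to matter little for fitted hazards, so the decision often rests on which interpretation (hazard ratios or odds ratios) is wanted.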

Best,

Marcos

Doro Kuebler

Join Date: Feb 2016

Posts: 11
#7

24 Feb 2016, 07:23

But these are not really recurrent events, right? Recurrent events in my case would mean the same relationship starting and stopping multiple times over time. That is not the case.
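One way to check this claim against the data is to look for gaps in each relationship's reporting months. The Python sketch below is illustrative only: the function names, the `(person, institution)` key and the integer month index are assumptions, and a gap could also reflect a missing report rather than a true stop and restart.

```python
def has_restart(months_reported):
    """True if the reported months contain a gap, i.e. the relationship
    appears to stop and later start again."""
    months = sorted(set(months_reported))
    return any(later - earlier > 1 for earlier, later in zip(months, months[1:]))


def relationships_with_restarts(panel):
    """panel: dict mapping (person, institution) -> list of integer month
    indices in which a report exists. Returns the keys showing a gap."""
    return [key for key, months in panel.items() if has_restart(months)]


# Hypothetical panel: only the second relationship stops and restarts.
panel = {
    ("p1", "A"): [1, 2, 3, 4],
    ("p1", "B"): [1, 2, 5, 6],
    ("p2", "A"): [7],
}
flagged = relationships_with_restarts(panel)
```

If the returned list is empty, each relationship is a single spell with one possible exit, which is the single-event setting rather than the recurrent-event one.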