I'm wondering about the best way to fit a longitudinal (panel) model where the outcome is binary (depressed/not depressed) and there are three waves of data.
- The simplest approach seems to be a random effects model: "xtlogit depressed X X2 X3, re". But some of my regressors only have data for 2 out of 3 waves, so if I include them, Stata won't use all three waves of data (I can tell because the output says the "max" number of obs per group is 2). Is there any way to fix this, i.e., to use the maximum number of available waves for each regressor?
- One potential option is to type: "logit L0.depressed L(0/2).X L(0/2).X2 L(0/1).X3", where X3 is the variable that doesn't have any data in wave one. But would that work? Would it produce unbiased estimates?
- Another option is to model how a change in X affects the change in depression: "logit D(0/1).depressed L(0/2).X L(0/2).X2 L(0/1).X3". But can this model account both for people who become depressed (0 --> 1) and for people who become undepressed (1 --> 0)? It seems like the "logit" command wouldn't work here, since the differenced outcome can be -1, 0, or 1 depending on whether the person became undepressed, stayed the same, or became depressed, whereas logit assumes a binary outcome.
- Do any of these models help rule out reverse causality? For example, if the random effects model shows that X, X2, and X3 really are associated with depression, can I know that they lead to depression rather than depression leading to them?
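
For concreteness, here is a minimal sketch of the setup I have in mind. It assumes a person identifier called id and a wave variable called wave, coded 1 to 3 (both names are placeholders, not my actual variable names). It declares the panel, runs the random effects model from the first bullet, and tabulates the differenced outcome from the third bullet to show the three values it can take.

```stata
* Declare the panel structure (id and wave are hypothetical variable names)
xtset id wave

* Random-effects logit from the first bullet
xtlogit depressed X X2 X3, re

* First difference of the binary outcome within person:
* -1 = became undepressed, 0 = no change, 1 = became depressed
generate d_depressed = D.depressed
tabulate d_depressed
```

The difference is missing for each person's first wave, so the tabulation covers at most two observations per person.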
Max