Dear members of Statalist,
I have a question that I could not find answered in any related post on this or any other forum. I use Stata/SE 16.1 for Windows.
I have a panel dataset of roughly 4,600 observations on 257 non-state actors observed over time (in years), and I want to find out which variables increase the likelihood that a non-state actor starts to use violent force. As a first step, I want to examine the Kaplan-Meier survival estimates for each of my independent variables, which are dummies defining a treatment and a control group. However, I only obtain the survivor function for the period in which both groups are under observation, not for the entire period, although non-state actors in one group may still be at risk.
Below is an example for autonomous resources that a non-state actor can mobilise. The variable autonomous is a dummy with 0 = no autonomous resources and 1 = autonomous resources. To obtain the survivor function at each failure time, I run sts list with the by() option; the command and its output are shown further below.
From the sts list output below it becomes clear that, among non-state actors without autonomous resources, 13 remain in the risk pool after year 44 (14 at risk, none failing, 1 net lost). However, the listing stops there for this group and no longer shows their survivor function. I also double-checked by browsing my data that there are still non-state actors in the study after that year and that some of them even fail, which the plot also shows. It is true, though, that the last non-state actor that does mobilise autonomous resources leaves the study in year 45, being censored.

Code:
. sts list, by(autonomous)

         failure _d:  civil_1_2 == 1
   analysis time _t:  (year-origin)
             origin:  time date0
  enter on or after:  time date0
                 id:  id

           At           Net     Survivor      Std.
  Time   Risk   Fail   Lost     Function     Error     [95% Conf. Int.]
------------------------------------------------------------------------
autonomous=0
     1    230      8      7       0.9652    0.0121     0.9317    0.9825
     2    215      7      8       0.9338    0.0165     0.8926    0.9596
     3    200      6      7       0.9058    0.0196     0.8591    0.9376
     4    187      8     11       0.8670    0.0231     0.8142    0.9057
     5    168      2      8       0.8567    0.0239     0.8023    0.8971
     6    158      2      8       0.8459    0.0248     0.7898    0.8880
     7    148      1      2       0.8401    0.0253     0.7832    0.8833
     8    145      0      5       0.8401    0.0253     0.7832    0.8833
     9    140      2      7       0.8281    0.0263     0.7692    0.8732
    10    131      0     13       0.8281    0.0263     0.7692    0.8732
    11    118      1     11       0.8211    0.0270     0.7608    0.8675
    12    106      0      6       0.8211    0.0270     0.7608    0.8675
    13    100      0     10       0.8211    0.0270     0.7608    0.8675
    14     90      0     12       0.8211    0.0270     0.7608    0.8675
    15     78      1      5       0.8106    0.0286     0.7468    0.8598
    16     72      1     12       0.7993    0.0304     0.7318    0.8516
    17     59      0      6       0.7993    0.0304     0.7318    0.8516
    18     53      0     10       0.7993    0.0304     0.7318    0.8516
    19     43      0      1       0.7993    0.0304     0.7318    0.8516
    20     42      1      4       0.7803    0.0351     0.7019    0.8404
    22     37      2      1       0.7381    0.0441     0.6399    0.8134
    23     34      0      3       0.7381    0.0441     0.6399    0.8134
    25     31      1      1       0.7143    0.0487     0.6063    0.7976
    26     29      0      3       0.7143    0.0487     0.6063    0.7976
    28     26      0      1       0.7143    0.0487     0.6063    0.7976
    30     25      0      3       0.7143    0.0487     0.6063    0.7976
    32     22      0      1       0.7143    0.0487     0.6063    0.7976
    35     21      1      1       0.6803    0.0570     0.5543    0.7777
    36     19      0      1       0.6803    0.0570     0.5543    0.7777
    37     18      0      1       0.6803    0.0570     0.5543    0.7777
    39     17      0      1       0.6803    0.0570     0.5543    0.7777
    40     16      1      0       0.6378    0.0675     0.4901    0.7530
    41     15      0      1       0.6378    0.0675     0.4901    0.7530
    44     14      0      1       0.6378    0.0675     0.4901    0.7530
autonomous=1
     1     27      5     -7       0.8148    0.0748     0.6109    0.9184
     2     29      2     -1       0.7586    0.0795     0.5594    0.8769
     3     28      0     -1       0.7586    0.0795     0.5594    0.8769
     4     29      2      4       0.7063    0.0821     0.5118    0.8348
     5     23      2     -1       0.6449    0.0857     0.4518    0.7849
     7     22      0      6       0.6449    0.0857     0.4518    0.7849
     8     16      2      2       0.5643    0.0920     0.3678    0.7209
     9     12      0      2       0.5643    0.0920     0.3678    0.7209
    12     10      0      1       0.5643    0.0920     0.3678    0.7209
    14      9      0     -1       0.5643    0.0920     0.3678    0.7209
    16     10      0      2       0.5643    0.0920     0.3678    0.7209
    17      8      0      2       0.5643    0.0920     0.3678    0.7209
    18      6      0      2       0.5643    0.0920     0.3678    0.7209
    20      4      0     -1       0.5643    0.0920     0.3678    0.7209
    23      5      0      1       0.5643    0.0920     0.3678    0.7209
    26      4      0     -1       0.5643    0.0920     0.3678    0.7209
    29      5      0      1       0.5643    0.0920     0.3678    0.7209
    35      4      0     -1       0.5643    0.0920     0.3678    0.7209
    36      5      0      1       0.5643    0.0920     0.3678    0.7209
    41      4      0      1       0.5643    0.0920     0.3678    0.7209
    42      3      0      1       0.5643    0.0920     0.3678    0.7209
    43      2      0      1       0.5643    0.0920     0.3678    0.7209
    45      1      0      1       0.5643    0.0920     0.3678    0.7209
------------------------------------------------------------------------
Note: Net Lost equals the number lost minus the number who entered.
Code:
sts graph, by(autonomous) ///
    legend(cols(1) label(1 "No Autonomous Resources") label(2 "Autonomous Resources") ///
    size(small) region(style(none))) ytitle("Probability of Survival") ///
    xtitle("Analysis Time (in Years)") xlabel(0 (10) 90) ///
    title("Autonomous Resources")

         failure _d:  civil_1_2 == 1
   analysis time _t:  (year-origin)
             origin:  time date0
  enter on or after:  time date0
                 id:  id
You can see from the figure that after year 44 non-state actors without autonomous resources are still at risk, and some even fail in years 46 and 51. However, sts list does not report the Kaplan-Meier survivor function beyond the period common to both groups. Stata must know these values, though; otherwise it could not draw the graph. The same happens when I use the Nelson-Aalen cumulative hazard estimator, and I have the same problem for a number of other variables. Is there a reason for this behaviour, or am I overlooking some basic fact? I would very much appreciate your assistance.
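As a cross-check, the estimates could presumably be written to a variable or listed for one group at a time instead of relying on the by() listing. This is only a minimal sketch, assuming the data are still stset exactly as in the output above; km_surv is a hypothetical variable name that does not exist in my dataset, and I have not confirmed that either route explains the truncated listing.

```stata
* Assumption: data are already stset as shown in the sts list output above.
* km_surv is a hypothetical name for the new variable.

* Store the Kaplan-Meier survivor function, estimated separately by group.
sts generate km_surv = s, by(autonomous)

* Inspect the stored estimates for the group without autonomous resources
* at record end times after year 44.
sort autonomous _t
list id _t _d km_surv if autonomous == 0 & _t > 44 & _st == 1, noobs sepby(_t)

* Alternative check: list the survivor function for that group on its own.
sts list if autonomous == 0
```

If the stored values change at years 46 and 51, that would at least confirm that the estimates exist and that only the by() listing is cut short.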
Thanks, Tom.