  • Kaplan-Meier Survival Functions Only Provided for Common Area

    Dear members of Statalist,

    I have a question for which I have been unable to find an answer in any related post, in this or any other forum. I use Stata/SE 16.1 for Windows.
    I have a panel dataset with some 4,600 observations on 257 non-state actors, observed yearly, and I want to find out which variables increase the likelihood that a non-state actor starts to use violent force. As a first step, I want to obtain Kaplan-Meier survival estimates for each of my independent variables, which are dummies defining a treatment and a control group. However, I only obtain the survivor function for the time period common to the treatment and control groups, not for the entire time span, although non-state actors in one group may still be at risk.
    Below is an example for autonomous resources that a non-state actor can mobilise. The variable autonomous is a dummy with 0 = no autonomous resources and 1 = autonomous resources. To obtain the survivor function at each failure time, I enter:

    Code:
    . sts list, by(autonomous)
     
    failure _d:  civil_1_2 == 1
    analysis time _t:  (year-origin)
    origin:  time date0
    enter on or after:  time date0
    id:  id
     
                 At           Net    Survivor      Std.
      Time     Risk   Fail   Lost    Function     Error     [95% Conf. Int.]
    ------------------------------------------------------------------------
    autonomous=0
         1      230      8      7      0.9652    0.0121     0.9317    0.9825
         2      215      7      8      0.9338    0.0165     0.8926    0.9596
         3      200      6      7      0.9058    0.0196     0.8591    0.9376
         4      187      8     11      0.8670    0.0231     0.8142    0.9057
         5      168      2      8      0.8567    0.0239     0.8023    0.8971
         6      158      2      8      0.8459    0.0248     0.7898    0.8880
         7      148      1      2      0.8401    0.0253     0.7832    0.8833
         8      145      0      5      0.8401    0.0253     0.7832    0.8833
         9      140      2      7      0.8281    0.0263     0.7692    0.8732
        10      131      0     13      0.8281    0.0263     0.7692    0.8732
        11      118      1     11      0.8211    0.0270     0.7608    0.8675
        12      106      0      6      0.8211    0.0270     0.7608    0.8675
        13      100      0     10      0.8211    0.0270     0.7608    0.8675
        14       90      0     12      0.8211    0.0270     0.7608    0.8675
        15       78      1      5      0.8106    0.0286     0.7468    0.8598
        16       72      1     12      0.7993    0.0304     0.7318    0.8516
        17       59      0      6      0.7993    0.0304     0.7318    0.8516
        18       53      0     10      0.7993    0.0304     0.7318    0.8516
        19       43      0      1      0.7993    0.0304     0.7318    0.8516
        20       42      1      4      0.7803    0.0351     0.7019    0.8404
        22       37      2      1      0.7381    0.0441     0.6399    0.8134
        23       34      0      3      0.7381    0.0441     0.6399    0.8134
        25       31      1      1      0.7143    0.0487     0.6063    0.7976
        26       29      0      3      0.7143    0.0487     0.6063    0.7976
        28       26      0      1      0.7143    0.0487     0.6063    0.7976
        30       25      0      3      0.7143    0.0487     0.6063    0.7976
        32       22      0      1      0.7143    0.0487     0.6063    0.7976
        35       21      1      1      0.6803    0.0570     0.5543    0.7777
        36       19      0      1      0.6803    0.0570     0.5543    0.7777
        37       18      0      1      0.6803    0.0570     0.5543    0.7777
        39       17      0      1      0.6803    0.0570     0.5543    0.7777
        40       16      1      0      0.6378    0.0675     0.4901    0.7530
        41       15      0      1      0.6378    0.0675     0.4901    0.7530
        44       14      0      1      0.6378    0.0675     0.4901    0.7530
    autonomous=1
         1       27      5     -7      0.8148    0.0748     0.6109    0.9184
         2       29      2     -1      0.7586    0.0795     0.5594    0.8769
         3       28      0     -1      0.7586    0.0795     0.5594    0.8769
         4       29      2      4      0.7063    0.0821     0.5118    0.8348
         5       23      2     -1      0.6449    0.0857     0.4518    0.7849
         7       22      0      6      0.6449    0.0857     0.4518    0.7849
         8       16      2      2      0.5643    0.0920     0.3678    0.7209
         9       12      0      2      0.5643    0.0920     0.3678    0.7209
        12       10      0      1      0.5643    0.0920     0.3678    0.7209
        14        9      0     -1      0.5643    0.0920     0.3678    0.7209
        16       10      0      2      0.5643    0.0920     0.3678    0.7209
        17        8      0      2      0.5643    0.0920     0.3678    0.7209
        18        6      0      2      0.5643    0.0920     0.3678    0.7209
        20        4      0     -1      0.5643    0.0920     0.3678    0.7209
        23        5      0      1      0.5643    0.0920     0.3678    0.7209
        26        4      0     -1      0.5643    0.0920     0.3678    0.7209
        29        5      0      1      0.5643    0.0920     0.3678    0.7209
        35        4      0     -1      0.5643    0.0920     0.3678    0.7209
        36        5      0      1      0.5643    0.0920     0.3678    0.7209
        41        4      0      1      0.5643    0.0920     0.3678    0.7209
        42        3      0      1      0.5643    0.0920     0.3678    0.7209
        43        2      0      1      0.5643    0.0920     0.3678    0.7209
        45        1      0      1      0.5643    0.0920     0.3678    0.7209
    ------------------------------------------------------------------------
    Note: Net Lost equals the number lost minus the number who entered.

    From the output it becomes clear that, for non-state actors without autonomous resources, 13 remain in the risk pool after year 44. However, the output no longer shows their survivor function. I also double-checked by browsing my data that there are still non-state actors in the study, and some even fail, as the plot below shows. It is true, though, that the last non-state actor that mobilises autonomous resources leaves the study in year 45, censored.


    Code:
    sts graph, by(autonomous) ///
    legend(cols(1) label(1 "No Autonomous Resources") label(2 "Autonomous Resources") ///
    size(small) region(style(none))) ytitle("Probability of Survival") ///
    xtitle("Analysis Time (in Years)") xlabel(0 (10) 90) ///
    title("Autonomous Resources")
     
    failure _d:  civil_1_2 == 1
    analysis time _t:  (year-origin)
    origin:  time date0
    enter on or after:  time date0
    id:  id

    [Figure: Graph.jpg, Kaplan-Meier survival estimates by autonomous]

    You can see from the figure that after year 44 non-state actors without autonomous resources are still at risk, and some even fail in years 46 and 51. However, sts list does not give me the Kaplan-Meier survivor function beyond the common time range. Stata must know it, though, otherwise it could not draw the graph. The same applies when I use the Nelson-Aalen cumulative hazard estimator, and I have the same problem for a number of other variables. Is there a reason for this behaviour, or am I simply overlooking some basic fact? I would very much appreciate your assistance.
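
    For reference, with by() the Kaplan-Meier estimate is computed separately within each group, from that group's own risk set. Writing n_j for the number at risk and d_j for the number of failures at the j-th distinct time t_j (notation introduced here for illustration, not taken from the Stata output), the estimator and a check against the first two autonomous=0 rows above are:

    ```latex
    \[
      \hat{S}(t) = \prod_{j:\, t_j \le t} \left(1 - \frac{d_j}{n_j}\right)
    \]
    \[
      \hat{S}(1) = 1 - \frac{8}{230} \approx 0.9652, \qquad
      \hat{S}(2) = \hat{S}(1)\left(1 - \frac{7}{215}\right) \approx 0.9338
    \]
    ```

    Nothing in this formula involves the other group, so in principle the estimate for autonomous=0 is defined at every time at which that group still has actors at risk.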

    Thanks, Tom.
    Last edited by Tom Konzack; 19 Aug 2021, 14:11.

  • #2
    Dear Statalist users,

    I figured that the reason for the absence of any response may be that my problem above was not easily reproducible. Therefore, below I provide an arbitrary panel dataset that illustrates my question. Here is the code to replicate the issue:

    Code:
    clear
    input id str10 date0 str10 date1 x1 event
    1    01jan2000    31dec2000    1    1
    2    01jan2000    31dec2000    1    0
    2    01jan2001    31dec2001    1    1
    3    01jan2000    31dec2000    1    0
    3    01jan2001    31dec2001    1    0
    3    01jan2002    31dec2002    1    1
    4    01jan2000    31dec2000    1    0
    4    01jan2001    31dec2001    1    1
    5    01jan2000    31dec2000    0    0
    5    01jan2001    31dec2001    1    0
    5    01jan2002    31dec2002    1    1
    6    01jan2000    31dec2000    0    0
    6    01jan2001    31dec2001    0    0
    6    01jan2002    31dec2002    0    0
    6    01jan2003    31dec2003    0    0
    6    01jan2004    31dec2004    0    0
    6    01jan2005    31dec2005    0    1
    7    01jan2000    31dec2000    0    0
    7    01jan2001    31dec2001    0    0
    7    01jan2002    31dec2002    0    0
    7    01jan2003    31dec2003    0    1
    8    01jan2000    31dec2000    0    0
    8    01jan2001    31dec2001    0    0
    8    01jan2002    31dec2002    0    0
    8    01jan2003    31dec2003    0    0
    8    01jan2004    31dec2004    0    0
    8    01jan2005    31dec2005    0    0
    end
    
    
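    * Convert the string dates to Stata daily dates (hence %td, not %ty)
    gen start = date(date0, "DMY")
    format start %td

    gen end = date(date1, "DMY")
    format end %td

    * Declare multiple-record survival data; failure occurs when event == 1
    stset end, id(id) time0(start) origin(time start) failure(event == 1)

    * Inspect the variables created by stset
    br id start end x1 event _t _t0 _st _d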
    
    sts list, by(x1)
    
    failure _d:  event == 1
    analysis time _t:  (end-origin)
    origin:  time start
    id:  id
    
                 At           Net    Survivor      Std.
      Time     Risk   Fail   Lost    Function     Error     [95% Conf. Int.]
    ------------------------------------------------------------------------
    x1=0 
       365        4      0      4      1.0000         .          .         .
       366        0      0     -3      1.0000         .          .         .
       730        3      0      3      1.0000         .          .         .
       731        0      0     -3      1.0000         .          .         .
      1095        3      0      3      1.0000         .          .         .
    x1=1 
       365        4      1      3      0.7500    0.2165     0.1279    0.9605
       366        0      0     -4      0.7500    0.2165     0.1279    0.9605
       730        4      2      2      0.3750    0.2165     0.0446    0.7339
       731        0      0     -2      0.3750    0.2165     0.0446    0.7339
      1095        2      2      0      0.0000         .          .         .
    ------------------------------------------------------------------------
    Note: Net Lost equals the number lost minus the number who entered.

    As you can see, the table of Kaplan-Meier survival estimates only covers the time period during which there are observations for both the control group (x1 = 0) and the treatment group (x1 = 1). It does not show the failures and the survivor function for the control group once there are no more observations in the treatment group. However, plotting the survivor function, as below, shows that there are indeed failures in the control group after that point.


    Code:
    sts graph, by(x1)

    [Figure: Graph_2.jpg, Kaplan-Meier survival estimates by x1]

    Could anyone please tell me how I can obtain the entire survival function in the table for both groups?

    Thanks for your support. Tom.


    • #3
      With the risktable option added, sts list shows all times for both groups. The following output is from Stata 16.1:

      Code:
      . sts list, by(x1) risktable
      
               failure _d:  event == 1
         analysis time _t:  (end-origin)
                   origin:  time start
                       id:  id
      
                   At           Net    Survivor      Std.
        Time     Risk   Fail   Lost    Function     Error     [95% Conf. Int.]
      ------------------------------------------------------------------------
      x1=0 
         365        4      0      4      1.0000         .          .         .
         366        0      0     -3      1.0000         .          .         .
         730        3      0      3      1.0000         .          .         .
         731        0      0     -3      1.0000         .          .         .
        1095        3      0      3      1.0000         .          .         .
        1096        0      0     -3      1.0000         .          .         .
        1460        3      1      2      0.6667    0.2722     0.0541    0.9452
        1461        0      0     -2      0.6667    0.2722     0.0541    0.9452
        1826        2      0      2      0.6667    0.2722     0.0541    0.9452
        1827        0      0     -2      0.6667    0.2722     0.0541    0.9452
        2191        2      1      1      0.3333    0.2722     0.0090    0.7741
      x1=1 
         365        4      1      3      0.7500    0.2165     0.1279    0.9605
         366        0      0     -4      0.7500    0.2165     0.1279    0.9605
         730        4      2      2      0.3750    0.2165     0.0446    0.7339
         731        0      0     -2      0.3750    0.2165     0.0446    0.7339
        1095        2      2      0      0.0000         .          .         .
      ------------------------------------------------------------------------
      Note: Net Lost equals the number lost minus the number who entered.
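
      If the estimates are needed as a variable rather than as a printed table, sts generate is another route. A minimal sketch, assuming the data are stset as in #2; s_km is a hypothetical variable name chosen for illustration:

      ```stata
      * Work on a copy so the analysis data are left untouched
      preserve

      * Kaplan-Meier survivor function, estimated separately within each level of x1
      sts generate s_km = s, by(x1)

      * Keep one row per group and analysis time, then list
      keep if _st == 1
      bysort x1 _t: keep if _n == 1
      list x1 _t s_km, sepby(x1) noobs

      restore
      ```

      The listed values should agree with the table above, for example 1 × (1 − 1/3) = 0.6667 at time 1460 and 0.6667 × (1 − 1/2) = 0.3333 at time 2191 for x1=0.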


      • #4
        Dear Paul,

        Thanks so much, that is the solution. I had assumed there must be a simple way to display the survivor function at all failure times for both groups, and "risktable" does exactly that. I wonder, though, why Stata does not do that by default.

        Many thanks to you.
        Best, Tom.
