
  • Kaplan-Meier Curve

    Hello everyone.
    I am working on a Cox regression using a panel dataset. It has 4029 observations on 255 firms. The following is an excerpt of the dataset I used:
    input float C_ID int Time double Informality byte Failure double ROA
    1 18 1.02 0 9.68
    1 19 -.42 0 11.37
    1 20 .39 0 15.66
    1 21 .24 0 16.66
    1 22 .04 0 11.86
    1 23 .02 0 11.99
    1 24 .01 0 12.97
    1 25 -.05 0 6.92
    1 26 .02 0 3.83
    1 27 -.34 0 2.99
    1 28 .35 0 7.06
    1 29 .23 0 12.99
    1 30 .01 0 12.13
    1 31 .07 0 14.15
    1 32 -.02 0 12.61
    1 33 .09 0 8.05
    1 34 -.03 0 5.02
    2 56 -.22 0 6.43
    2 57 -.42 0 6.72
    2 58 .39 0 7.2
    2 59 .24 0 8.74
    2 60 .04 0 9.47
    2 61 .02 0 5.86
    2 62 .01 0 .95
    2 63 -.05 0 2.15
    2 64 .02 0 1.47
    2 65 -.34 0 1.93
    2 66 .35 0 2.29
    2 67 .23 0 3.54
    2 68 .01 0 3.89
    2 69 .07 0 4.37
    2 70 -.02 0 6.31
    2 71 .09 0 3.79
    2 72 -.03 0 3.04
    3 6 -.36 0 73.54
    3 7 -.34 0 23.98
    3 8 .39 0 20.96
    3 9 .25 0 16.02
    3 10 .05 0 36.97
    3 11 -.02 0 22.29
    3 12 .04 0 14.62
    3 13 0 0 12.84
    3 14 -.07 0 14.62
    3 15 -.49 0 27.82
    3 16 .7 0 73.17
    3 17 .3 0 27.63
    3 18 -.04 0 49.05
    3 19 .02 0 21.81
    3 20 .08 0 44.49
    3 21 .13 0 23.93
    3 22 -.04 0 30.87
    4 9 .01 0 35.63
    4 10 -.29 0 6.57
    4 11 .21 0 11.42
    4 12 .22 0 15.86
    4 13 .01 0 15.36
    4 14 -.01 0 23.5
    4 15 -.02 0 5.01
    4 16 -.02 0 5.61
    4 17 .02 0 8.91
    4 18 -.4 0 10.87
    4 19 .37 0 6.54
    4 20 .25 0 4.58
    4 21 -.01 0 5.19
    4 22 .02 0 23.51
    4 23 -.01 0 35.08
    4 24 .03 0 34.43
    4 25 -.04 0 35.42
    5 61 .04 0 13.53
    5 62 -.39 0 6.81
    5 63 .33 0 4.36
    5 64 .24 0 4.39
    5 65 .01 0 3.3
    5 66 -.01 0 3.47
    5 67 -.02 0 3.43
    5 68 0 0 5.78
    5 69 0 0 5.92
    5 70 -.49 0 5.35
    5 71 .58 0 6.41
    5 72 .26 0 6.6
    5 73 0 0 13.14
    5 74 0 0 18
    5 75 .01 0 316.4
    5 76 .05 0 351.51
    5 77 -.03 0 365.06
    6 9 -.02 0 15.57
    6 10 -.39 0 15.68
    6 11 .33 0 4.44
    6 12 .24 0 6.06
    6 13 .01 0 6.98
    6 14 -.01 0 6.8
    6 15 -.02 0 8.44
    6 16 0 0 9.67
    6 17 0 0 9.75
    6 18 -.49 0 5.94
    6 19 .58 0 4.29
    6 20 .26 0 8.9
    6 21 0 0 14.47
    6 22 0 0 12.38
    6 23 .01 0 2.08
    The main independent variable is Informality; Failure is the event indicator for firm liquidation, and Time is the time to liquidation. ROA is firm profit. Failure events (coded 1) do not appear in this excerpt; they occur later in the data. I tried to plot the Kaplan-Meier curves after the Cox regression, but the graph does not look appropriate. There are two groups, based on firm resource diversification. Please help me understand what is wrong with the graph and why. It would be very helpful, as this is for my thesis.
    Attached Files

  • #2
    Anuradha:
    I'm under the impression that the two KM curves overlap before drifting apart from, say, time 65 onwards.
    Why they behave this way, I do not know, and your excerpt (with no Failure=1) does not help, unfortunately.
    Code:
    . sum F
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
         Failure |        100           0           0          0          0
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Thank you, Carlo Lazzaro. I do not know either, but only 20 firms in my data experience failure (coded as Failure=1). Could the small number of failed firms be the cause?



      • #4
        Anuradha:
        Yes, it might be.
        What does the logrank test tell you?
        Kind regards,
        Carlo
        (Stata 19.0)
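
        A minimal sketch of the log-rank test mentioned above, run on Stata's shipped example data rather than the poster's data. The dataset drugtr and the grouping variable drug are stand-ins here, not the poster's variables; drugtr comes already stset.

```stata
* Example data shipped with Stata (already stset); a stand-in for the poster's data
webuse drugtr, clear

* Log-rank test of equality of survivor functions across the two drug groups
sts test drug, logrank
```

        With the poster's data, the two-group diversification indicator would take the place of drug, assuming the data have been stset correctly first.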



        • #5
          Originally posted by Anuradha Saikia
          I tried to plot the Kaplan Meier curves after Cox regression. But the graph does not look appropriate. Please help me understand what is wrong with the graph and why.
          What is it that you think is "not appropriate" or "wrong"? Without more information we can only guess. Carlo, I believe, guessed that you didn't expect survival to be so high and it seems that this is simply due to the number of failures being relatively small.

          Or is the problem that you wanted to plot curves based on the fitted Cox model and therefore expected them to follow a similar pattern due to the proportional hazards assumption? Here's an example (from the online help) of plotting curves based on a fitted Cox model:

          Code:
          webuse drugtr
          stcox age drug
          stcurve, survival at1(drug=0) at2(drug=1)
          You will see that the default graph title (at least in version 18) is "Cox proportional hazards regression". The title in your graph is what one gets by default with:

          Code:
          sts graph, by(drug)
          These are empirical curves and have nothing to do with the Cox regression model you fitted. You'll get the same curves without fitting a Cox model.

          Please explain why you think the curve you got is wrong (i.e., how it differs from what you expected) and also show the code you used.



          • #6
            Thank you, Paul Dickman. Here is the Cox regression code I used on my panel data.
            Code:
             stcox Informality  POLRISK HHI SIZE RD ESG SG Sub_Diversification, vce(robust)
            Here Informality is my main independent variable. The output I got is below.

            Code:
                     failure _d:  Failure
               analysis time _t:  Time
            
            Iteration 0:   log pseudolikelihood = -140.81549
            Iteration 1:   log pseudolikelihood = -132.50115
            Iteration 2:   log pseudolikelihood = -132.03338
            Iteration 3:   log pseudolikelihood = -132.02234
            Iteration 4:   log pseudolikelihood = -132.02234
            Refining estimates:
            Iteration 0:   log pseudolikelihood = -132.02234
            
            Cox regression -- Breslow method for ties
            
            No. of subjects      =        4,023             Number of obs    =       4,023
            No. of failures      =           20
            Time at risk         =        99038
                                                            Wald chi2(8)     =       41.64
            Log pseudolikelihood =   -132.02234             Prob > chi2      =      0.0000
            
            -------------------------------------------------------------------------------------
                                |               Robust
                             _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
            --------------------+----------------------------------------------------------------
                    Informality |   .3067587   .1551332    -2.34   0.019     .1138491    .8265409
                        POLRISK |   1.61e+07   1.45e+08     1.84   0.066     .3240018    7.97e+14
                            HHI |    1.23364   .5463154     0.47   0.635     .5178869    2.938608
                           SIZE |   .9017934   .0406553    -2.29   0.022     .8255295    .9851028
                             RD |   1.085062   .0376696     2.35   0.019     1.013687    1.161463
                            ESG |   1.042232   .0173778     2.48   0.013     1.008723    1.076855
                             SG |   .8022922   .6836328    -0.26   0.796     .1510164     4.26227
            Sub_Diversification |   .5433092   .3179976    -1.04   0.297     .1725205    1.711013
            -------------------------------------------------------------------------------------
            .

            Then I used the following code for the Kaplan-Meier failure curve:
            Code:
              sts graph, failure
            .



            • #7
              I get the following Kaplan-Meier curve, which I am not able to interpret.
              Attached Files



              • #8
                The "failure curve" is one minus survival. It shows the estimated probability of failure as a function of time. There are relatively few failures (only 20) in your data, so the probabilities are low.

                Note that the curve is completely independent of the Cox model. You would get the same curve without ever fitting a Cox model.
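
                A small sketch of both points, using Stata's shipped drugtr data as a stand-in (not the poster's data). The variable names km_surv and km_fail are made up for illustration. No Cox model is fitted anywhere, yet the failure curve is still available.

```stata
* Example data shipped with Stata (already stset); no stcox is run at any point
webuse drugtr, clear

* Kaplan-Meier survivor estimate S(t) for the whole sample
sts generate km_surv = s

* Failure function F(t) = 1 - S(t), the quantity drawn by "sts graph, failure"
generate double km_fail = 1 - km_surv

* The empirical failure curve, obtained without any regression model
sts graph, failure
```

                With few failures relative to the number at risk, km_surv stays close to 1 and km_fail close to 0, which is why such a curve can look nearly flat.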



                • #9
                  Okay, Paul Dickman. I understand that the issue is the small number of failed firms in my data.
