  • cumulative hazard graph

    Hi there
    My understanding is that cumulative hazard is basically the inverse of cumulative survival.

    However this does not appear to be the case when I generate a survival curve and a cumulative hazard curve using exactly the same data and very similar commands.

    Survival analysis:
    command: sts graph, xlab(0(1)15) ylab(0(0.1)1.0, angle(horizontal))
    graph: "survival curve" (attached)

    Cumulative hazard analysis:
    command: sts graph, cumhaz xlab(0(1)15) ylab(0(0.1)1.0, angle(horizontal))
    graph: "cumulative hazard curve" (attached)


    The total cumulative hazard is about 47%, whereas the total cumulative survival is about 63%. These don't add up to 100%. Am I missing something?

    Any thoughts greatly appreciated.

    With thanks,
    Tim



  • #2
    Your understanding is incorrect. If \(\lambda(t)\) is the hazard function; \(\Lambda(t)\) is the cumulative, or integrated, hazard; and \(S(t)\) is the survival function, here are the relationships between them:
    \[
    \Lambda(t) = \int_0^t \negthinspace \lambda(x)dx
    \]
    \[
    \Lambda(t) = -\text{log}(S(t))
    \]
    \[
    S(t) = \text{exp}\left(-\Lambda(t)\right) = \text{exp}\left(- \int_0^t \negthinspace \lambda(x)dx\right)
    \]
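    As a rough numerical check (an added illustration that uses the approximate values read off the graphs in #1, not output from the original data, with log meaning the natural logarithm), a survival probability of about 0.63 corresponds to
    \[
    \Lambda(t) = -\text{log}(0.63) \approx 0.46
    \]
    which is in line with the cumulative hazard of about 0.47 on the graph. The two curves are linked through the logarithm; they are not expected to add to 1.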
    Last edited by Steve Samuels; 01 Apr 2016, 08:00.
    Steve Samuels
    Statistical Consulting
    [email protected]

    Stata 14.2



    • #3
      It is the (cumulative) failure and survival functions which add to 1.

      \[
      S(t) = 1-F(t)
      \]


      • #4
        Thanks for the reply Steve Samuels.

        And yet the cumulative hazard lifetable is entirely consistent with the survival lifetable (hazard+survival=1), as follows:

        ltable intervalyrs recur, hazard

        Interval    Cum. Failure
         0   1      0.2804
         1   2      0.3086
         2   3      0.3258
         3   4      0.3384
         4   5      0.347
         5   6      0.3539
         6   7      0.359
         7   8      0.3632
         8   9      0.3669
         9  10      0.3699
        10  11      0.372
        11  12      0.3735
        12  13      0.3758

        ltable intervalyrs recur

        Interval    Cum. Survival
         0   1      0.7196
         1   2      0.6914
         2   3      0.6742
         3   4      0.6616
         4   5      0.653
         5   6      0.6461
         6   7      0.641
         7   8      0.6368
         8   9      0.6331
         9  10      0.6301
        10  11      0.628
        11  12      0.6265
        12  13      0.6242



        It seems very odd to me that the lifetables are consistent with each other, and that both lifetables are also consistent with the survival graph, but that the hazard graph is inconsistent with all of the others.

        Does anyone else get the same problem when using the "sts graph, cumhaz" command....?



        • #5
          ltable is showing the estimated cumulative failure function F(t). You must have run
          Code:
          ltable intervalyrs recur, failure
          not
          Code:
          ltable intervalyrs recur, hazard
          But even this version would not list the cumulative hazard, but rather the interval-specific hazard rates.

          Please read FAQ 12 and present all commands, listings, and results between CODE delimiters, so that columns align correctly.
          Last edited by Steve Samuels; 01 Apr 2016, 09:54.


          • #6
            Except for the fact that both functions increase, the cumulative hazard is nothing like the failure curve. In fact, the cumulative hazard can exceed 1.0, so it is not a probability. Below are the estimated failure function and two different estimates of the cumulative hazard function. The first is the one Stata generates: the Nelson-Aalen estimate, shown on page 300 of the Manual Entry for sts. The N-A estimate is the finite-sample version of the definition:
            \[
            \Lambda_1(t) = \int_0^t \lambda(x)dx
            \]
            A second estimate can be based on the mathematical relationship of the cumulative hazard function to the Survival curve
            \[
            \Lambda_2(t)= -\textrm{log}(S(t))
            \]
            where the Kaplan-Meier estimate \(\widehat{S}\) is substituted for \(S\). The two estimates are very close.
            Code:
            webuse catheter, clear
            stset time infect
            sts gen cumhaz1 = na km = s
            label var cumhaz1 "Cum Haz:Nelson-Aalen"
            gen cumhaz2 = -log(km)
            label var cumhaz2 "Cum Haz:-log(s)"
            gen cumfail = 1 - km
            sort _t
            label var cumfail "Cumulative Failure Probability"
            #delim;
            twoway connect cumhaz1 cumhaz2 cumfail _t,
             c(stairstep stairstep stairstep)
             title("KM Failure & Two Cum Hazard Estimates")
             saving(g01, replace);
            #delim cr
            graph use g01
            graph export graph.png, replace
            [Image: graph.png]
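            To check the closeness numerically rather than by eye, a minimal sketch that continues from the variables created in the code above (the variable name diff is illustrative and not part of the original example):
            Code:
            * difference between the -log(KM) and Nelson-Aalen estimates
            gen diff = cumhaz2 - cumhaz1
            summarize diff cumhaz1 cumhaz2 cumfail
            * side-by-side listing at the observed times
            list _t cumhaz1 cumhaz2 cumfail
            The difference should never be negative, because \(-\textrm{log}(1-x) \ge x\) for each factor of the Kaplan-Meier product, and it is missing wherever the Kaplan-Meier estimate reaches zero.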

            Last edited by Steve Samuels; 03 Apr 2016, 11:26.


            • #7
              Tim,

              You observed that \(\widehat{\Lambda}(t) \approx 1 - \widehat{S}(t) = \widehat{F}(t)\). The example above shows that this isn't true in general. However the approximation does hold for small values of \(F(t)\) and \(\Lambda(t)\) and that's all you have in your data set (\(F(t)<0.30\)). Here's a proof. Start with the mathematical formula:
              \[
              S(t) = e^{-\Lambda(t)}
              \]
              Now consider the exponential function \(e^{-x}\) for small values of \(x\). The first-order Taylor series for \(e^{-x}\) expanded about zero is:
              \[
              e^{-x} \approx 1- x
              \]
              so that \(x \approx 1- e^{-x}\). Apply this to small values of \( x =\Lambda(t)\) and you get
              \[
              S(t) \approx 1- \Lambda(t)
              \]
              or
              \[
              \Lambda(t) \approx 1 - S(t) = F(t)
              \]
              as you observed. Here's the graph of the Nelson-Aalen estimate and the estimated failure curve for the data above with \(F(t) < 0.3\). The curves do look similar and values are close early on.
              [Image: graph2.png]
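              A few values show how quickly the approximation degrades (an added illustration; the values of \(\Lambda\) are chosen arbitrarily, apart from 0.47, which is roughly the final cumulative hazard reported in #1). Since \(F = 1 - e^{-\Lambda}\):
              \[
              \begin{array}{ccc}
              \Lambda & F = 1 - e^{-\Lambda} & \Lambda - F \\
              0.10 & 0.095 & 0.005 \\
              0.30 & 0.259 & 0.041 \\
              0.47 & 0.375 & 0.095 \\
              1.00 & 0.632 & 0.368
              \end{array}
              \]
              For small \(\Lambda\) the gap is roughly \(\Lambda^2/2\), the next term of the Taylor series.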


              Last edited by Steve Samuels; 03 Apr 2016, 22:24.


              • #8
                Dear Steve - thank you very much for your thoughts and knowledge on this.

                I will have to dig deeper to understand this more fully and will post back if I have anything further to contribute. Your input is hugely appreciated and helpful.

                Thanks again,
                Tim



                • #9
                  Incidentally, I did run:

                  ltable intervalyrs recur, hazard

                  and not

                  ltable intervalyrs recur, failure

                  ...and, interestingly, the hazard lifetable doesn't show cumulative hazard (it only gives interval-specific hazards), but it does give cumulative failure, which was why I quoted the cumulative failure results from the ltable hazard command (which added to my confusion!).

                  It would surely be useful to see cumulative hazard in tabular (lifetable) format in a way that corresponds roughly to a cumulative hazard graph. At the moment I cannot find a way to generate this.



                  • #10
                    You can use sts list with the cumhaz option to see the cumulative hazard values.
                    ---------------------------------
                    Maarten L. Buis
                    University of Konstanz
                    Department of history and sociology
                    box 40
                    78457 Konstanz
                    Germany
                    http://www.maartenbuis.nl
                    ---------------------------------



                    • #11
                      sts list will display the cumulative hazard.
                      Code:
                      sts list,  survival failure cumhaz
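                      For a table that lines up with yearly lifetable intervals, a sketch using the at() option of sts list. It assumes the data are already stset with analysis time in years; the yearly grid from 1 to 13 is a guess based on the lifetables in #4, not something taken from the original data:
                      Code:
                      * Nelson-Aalen cumulative hazard at the end of each year
                      sts list, cumhaz at(1(1)13)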
                      Why was the cumulative hazard omitted from ltable? The cumulative hazard has never been a standard life table measure; the values by themselves are not meaningful. When needed, they can be estimated by \(-\textrm{log}(S(t))\).
                      I do find the cumulative hazard function useful:
                      1. For estimating the average hazard rate in an interval \([t_j, t_k]\).
                      \[
                      \textrm{average hazard} = \frac{\Lambda(t_k) - \Lambda(t_j)}{t_k - t_j}
                      \]
                      2. For checking distributional and model assumptions. For example, one of the plots for checking a proportional hazards assumption (stcox postestimation) is called the "log-log plot of survival". This is a plot of the log cumulative hazard.
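                      As an added illustration of item 1, take the lifetable values posted in #4 and assume they are the survival estimates at the ends of years 1 and 13, with \(\Lambda\) estimated by \(-\textrm{log}(S)\) (natural log):
                      \[
                      \Lambda(1) \approx -\textrm{log}(0.7196) \approx 0.329, \qquad \Lambda(13) \approx -\textrm{log}(0.6242) \approx 0.471
                      \]
                      \[
                      \textrm{average hazard on } [1, 13] \approx \frac{0.471 - 0.329}{13 - 1} \approx 0.012 \textrm{ per year}
                      \]
                      This compares with an average of about 0.329 per year over the first year, so most of the hazard in these data is concentrated early.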
                      Last edited by Steve Samuels; 04 Apr 2016, 09:23.