Test Graph Export

Steve Samuels

Join Date: Mar 2014

Posts: 1786
#1

03 Apr 2016, 10:04

Except that both functions increase, the cumulative hazard estimate is nothing like the failure estimate \(\widehat{F} = 1 - \widehat{S}\). In fact, the cumulative hazard estimate can exceed 1.0.

Below I plot the estimated failure function and two different estimates of the cumulative hazard function. The first is the one Stata generates. It is the Nelson-Aalen estimate, shown on page 300 of the Manual Entry for sts. The N-A estimate is the finite-sample version of the definition:
\[
\Lambda_1(t) = \int_0^t \lambda(u)\,du
\]
A second estimate can be based on the mathematical relationship of the cumulative hazard function to the survival curve:
\[
\Lambda_2(t) = -\log S(t)
\]
where the Kaplan-Meier estimate \(\widehat{S}\) is substituted for \(S\). The graph below shows that the two estimates are very close.
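
A short worked check of both claims may help. The notation is an assumption introduced here for illustration: \(d_i\) failures among \(n_i\) subjects at risk at each distinct failure time \(t_i\). Because \(-\log(1-x) \approx x\) for small \(x\), the two cumulative hazard estimates nearly agree whenever every \(d_i/n_i\) is small:
\[
-\log \widehat{S}(t) = -\sum_{t_i \le t} \log\left(1 - \frac{d_i}{n_i}\right) \approx \sum_{t_i \le t} \frac{d_i}{n_i} = \widehat{\Lambda}_1(t)
\]
With an illustrative value \(\widehat{S}(t) = 0.2\), the failure estimate stays below 1 while the cumulative hazard does not:
\[
\widehat{F}(t) = 1 - 0.2 = 0.8, \qquad -\log(0.2) \approx 1.61
\]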

Code:

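* Load the example data and declare it as survival-time data
webuse catheter, clear
stset time, failure(infect)

* Nelson-Aalen cumulative hazard and Kaplan-Meier survivor function
sts gen cumhaz1 = na km = s
label var cumhaz1 "Cum Haz:Nelson-Aalen"

* Second cumulative hazard estimate, from the K-M survivor function
gen cumhaz2 = -log(km)
label var cumhaz2 "Cum Haz:-log(s)"

* Cumulative failure probability
gen cumfail = 1 - km
label var cumfail "Cumulative Failure Probability"

sort _t

* Draw all three curves as step functions and save the graph
#delimit ;
twoway connected cumhaz1 cumhaz2 cumfail _t,
    connect(stairstep stairstep stairstep)
    title("KM Failure & Two Cum Hazard Estimates")
    saving(g01, replace);
#delimit cr

* Reload the saved graph and export it
graph use g01
graph export graph.png, replace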

Last edited by Steve Samuels; 03 Apr 2016, 10:42.

Steve Samuels
Statistical Consulting

Stata 14.2
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#2

17 Jul 2016, 17:27

You've asked two versions of this question, the other being here, but the details change in each. In this post the population size is 100,000; in the other it is 10,000 and 20,000. I don't believe that any of these numbers is the real one. In future posts, please do not use fake numbers; they just confuse the issue.

In this post, you say that you collect all cases of deaths in 10 clusters, but you don't say that they are the same ones that were selected with PPS; I'll assume that they were.

Here is how to compute the weights:

1. Let \(N\) be the population size and \(N_j\) be the size of the j-th selected cluster. Then the probability of selecting the j-th cluster is \(f_1 = N_j/N\).
2. In each sampled cluster, the probability that (the record of) a dead person is selected for study is \(f_2 = 1\). The probability of selecting (the record of?) a living person for study is \(f_2 = 300/N_j\).

The overall probability of selecting a person is \(f = f_1 \times f_2\). For a living person, \(f = (N_j/N) \times (300/N_j) = 300/N\); for a dead person, the probability of selection is \(f = (N_j/N) \times 1 = N_j/N\). The design weight is \(W = 1/f\).
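
As a numeric illustration only: take \(N = 100{,}000\), the population size stated in this post, and a hypothetical cluster of size \(N_j = 5{,}000\) (that cluster size is an assumption, not a figure from the question). The formulas above then give
\[
\text{living: } f = \frac{300}{100{,}000} = 0.003, \quad W = \frac{1}{0.003} \approx 333.3; \qquad
\text{dead: } f = \frac{5{,}000}{100{,}000} = 0.05, \quad W = \frac{1}{0.05} = 20
\]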

cetin ser

Join Date: Nov 2015

Posts: 6
#3

19 Jul 2016, 05:33

My data (recurrent exacerbations of chronic obstructive pulmonary disease) are attached. Which frailty model (shared, joint, ...) should I use? Please help me: how can I get the Stata code?
Thanks.

time: time to recurrent COPD exacerbation
status :failure/
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#4

19 Jul 2016, 13:39

This is a forum for making test posts. Ask on the General Forum; before you do, read the FAQ and follow the directions in FAQ 12.

Steve Samuels

Join Date: Mar 2014

Posts: 1786
#5

19 Jul 2016, 13:40

Correction: I now understand where the "10" in your equation came from. For estimating totals, the correct factor to use in the first stage is indeed:

\[
f_{1j} = n \frac{N_j}{N}
\]
or, with \(n = 10\),
\[
f_{1j} = 10 \frac{N_j}{N}
\]

This is the probability that cluster j will be selected by a proper PPS sampling algorithm. If, however, \(\frac{N_j}{N}>1/10\) for any cluster, you would have been forced to treat that cluster as a certainty unit and restart the algorithm with the reduced n.

One thing unclear from your description is whether deaths were included in the population \(N_j\) and \(N\) totals. If so, then for living people

\[
f_{2j} = \frac{300}{N_j - D_j}
\]
where \(D_j\) is the number of deaths in cluster j.

Then form the design weight

\[
W_j = (1/f_{1j})(1/f_{2j})
\]
as before.

The design weights as I specified them would be okay for estimating statistics other than totals. I apologize for the error.
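
The corrected weights could be computed along the following lines. This is only a sketch meant to be run as a do-file: the toy data, the variable names (cluster, Nj, Dj, dead, f1, f2, wt), and the cluster sizes are all hypothetical, and it assumes deaths are included in the \(N_j\) and \(N\) totals.

```stata
* Toy data; all names and values below are hypothetical
clear
input cluster Nj Dj dead
1 5000 40 0
1 5000 40 1
2 8000 90 0
2 8000 90 1
end

local N = 100000   // population size (illustrative)
local n = 10       // number of clusters drawn by PPS

* First stage: probability that cluster j is selected
gen double f1 = `n' * Nj / `N'
assert f1 <= 1     // a value above 1 would mean a certainty unit

* Second stage: deaths taken with certainty; 300 living people sampled per cluster
gen double f2 = cond(dead == 1, 1, 300 / (Nj - Dj))

* Design weight
gen double wt = 1 / (f1 * f2)
list cluster dead f1 f2 wt
```

Locals are used for the two constants rather than scalars because a scalar named N could be mistaken for an abbreviation of the variable Nj inside an expression.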

Last edited by Steve Samuels; 19 Jul 2016, 13:42.

