Terminating observation within matched pairs of subjects in a matched cohort analysis

Harriet Forbes

Join Date: Aug 2024

Posts: 3
#1


16 Aug 2024, 04:58

Hi there

I am carrying out a matched cohort study in which each exposed individual is individually matched (on age, sex, GP practice and calendar time) to up to 10 unexposed individuals. The matched sets are identified by the variable setid. I am using a stratified Cox model [i.e. stcox exposed, strata(setid)].

Does anybody know whether Stata terminates follow-up among members of a matched set once one member is no longer under observation? In other words, does Stata count events among the unexposed if they occur AFTER the exposed individual is censored?

I get identical results whether or not I manually censor the unexposed individuals on the date the exposed individual is censored (i.e. change their exit date to that date). However, the number of failures and the time at risk reported in the output differ. The output for both scenarios is below.
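For concreteness, the manual censoring described above could be done along these lines. This is only a sketch: it assumes one record per person and the variable names shown in the stset output (setid, exposed, doentry, doexit, dementia, id); exp_exit is a hypothetical helper variable, and the stset call is reconstructed from the output rather than copied from the original do-file.

```stata
* Exit date of the exposed member, copied to every member of the matched set
bysort setid: egen exp_exit = max(cond(exposed == 1, doexit, .))

* Unexposed members followed beyond that date: drop any later event first,
* then move their exit date back to the exposed member's exit date
replace dementia = 0        if exposed == 0 & doexit > exp_exit & !missing(exp_exit)
replace doexit   = exp_exit if exposed == 0 & doexit > exp_exit & !missing(exp_exit)

* Re-declare the survival data and refit the stratified model
stset doexit, failure(dementia == 1) origin(time doentry) enter(time doentry) ///
    id(id) scale(365.25)
stcox exposed, strata(setid)
```

The event indicator is reset before the exit date is changed because both replace statements rely on the original value of doexit.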

Thanks in advance for your help.

Best wishes
Harriet

-------------------------------------------------------------------------------------------------
With unexposed individuals manually censored when exposed is censored:

Failure _d: dementia==1
Analysis time _t: (doexit-origin)/365.25
Origin: time doentry
Enter on or after: time doentry
Exit on or before: time doexit
ID variable: id

Iteration 0: Log likelihood = -5123.2261
Iteration 1: Log likelihood = -5123.2157
Iteration 2: Log likelihood = -5123.2157
Refining estimates:
Iteration 0: Log likelihood = -5123.2157

Stratified Cox regression with no ties
Strata variable: setid

No. of subjects = 121,369 Number of obs = 121,369
No. of failures = 2,534
Time at risk = 464,617.632
LR chi2(1) = 0.02
Log likelihood = -5123.2157 Prob > chi2 = 0.8850

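------------------------------------------------------------------------------
_t | Haz. ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
exposed | 1.008399 .0582552 0.14 0.885 .900448 1.129293
------------------------------------------------------------------------------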

-----------------------------------
Using all available data:

Failure _d: dementia==1
Analysis time _t: (doexit-origin)/365.25
Origin: time doentry
Enter on or after: time doentry
Exit on or before: time doexit
ID variable: id

Iteration 0: Log likelihood = -5123.2261
Iteration 1: Log likelihood = -5123.2157
Iteration 2: Log likelihood = -5123.2157
Refining estimates:
Iteration 0: Log likelihood = -5123.2157
Stratified Cox regression with no ties
Strata variable: setid
No. of subjects = 121,369 Number of obs = 121,369
No. of failures = 2,534
Time at risk = 464,617.632
LR chi2(1) = 0.02
Log likelihood = -5123.2157 Prob > chi2 = 0.8850

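------------------------------------------------------------------------------
_t | Haz. ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
exposed | 1.008399 .0582552 0.14 0.885 .900448 1.129293
------------------------------------------------------------------------------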
Tags: stcox
Harriet Forbes

Join Date: Aug 2024

Posts: 3
#2

16 Aug 2024, 07:06

I posted the wrong output for the "all available data" analysis above.

This is the correct output:

Failure _d: dementia==1
Analysis time _t: (doexit-origin)/365.25
Origin: time doentry
Enter on or after: time doentry
Exit on or before: time doexit
ID variable: id

Iteration 0: Log likelihood = -10896.591
Iteration 1: Log likelihood = -10896.58
Iteration 2: Log likelihood = -10896.58
Refining estimates:
Iteration 0: Log likelihood = -10896.58

Stratified Cox regression with Breslow method for ties
Strata variable: setid

No. of subjects = 121,369 Number of obs = 121,369
No. of failures = 6,349
Time at risk = 820,479.765
LR chi2(1) = 0.02
Log likelihood = -10896.58 Prob > chi2 = 0.8850

------------------------------------------------------------------------------
_t | Haz. ratio Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
exposed | 1.008399 .0582552 0.14 0.885 .900448 1.129293
------------------------------------------------------------------------------
Andrea Discacciati

Join Date: Feb 2016

Posts: 194
#3

16 Aug 2024, 09:31

Does anybody know whether Stata terminates follow-up among members of a matched set once one member is no longer under observation?

No.

But look at how the partial likelihood is defined for a stratified Cox model: it is the product of the stratum-specific partial likelihoods (see https://web.njit.edu/~wguo/Math%2065...xt_Book%5D.pdf).

Intuitively, once the only exposed observation in a given stratum has been removed from the risk set (because of an event or censoring), the remaining observations in that stratum, all unexposed, carry no further information on the _hazard ratio_ for exposed vs unexposed observations. So you might as well censor them: it won't change the HR estimate. Note, however, that it would affect other quantities, for example the estimated survival functions (see toy example below).
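To spell the argument out, here is a sketch in notation that is not from the thread: \beta is the log hazard ratio, x_i is the exposure indicator (1 = exposed, 0 = unexposed), D_s is the set of subjects who fail in stratum s, t_i is the failure time of subject i, and R_s(t) is the risk set in stratum s at time t. Ties are ignored for simplicity.

```latex
L(\beta) = \prod_{s} L_s(\beta),
\qquad
L_s(\beta) = \prod_{i \in D_s}
  \frac{\exp(\beta x_i)}{\sum_{j \in R_s(t_i)} \exp(\beta x_j)}

% If the exposed member of stratum s has already left the risk set at t_i,
% then x_i = 0 and x_j = 0 for every j in R_s(t_i), so the factor becomes
\frac{\exp(0)}{\sum_{j \in R_s(t_i)} \exp(0)} = \frac{1}{\lvert R_s(t_i) \rvert}
```

That factor does not involve \beta, so it multiplies the partial likelihood by a constant and leaves both the maximising value of \beta and the curvature of the log likelihood unchanged.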

If you ignore the strata you'll get different HRs instead, of course (see toy example below).

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(sid x t d t2 d2)
1 1 5.5 1 5.5 1
1 0 1 1 1 1
1 0 2 1 2 1
1 0 3 1 3 1
1 0 4 1 4 1
1 0 5 1 5 1
1 0 6 1 5.5 0
1 0 7 1 5.5 0
1 0 8 1 5.5 0
1 0 9 1 5.5 0
2 1 2.5 1 2.5 1
2 0 1.2 1 1.2 1
2 0 2.2 1 2.2 1
2 0 3.2 1 2.5 0
2 0 4.2 1 2.5 0
2 0 5.2 1 2.5 0
2 0 6.2 1 2.5 0
2 0 7.2 1 2.5 0
2 0 8.2 1 2.5 0
2 0 9.2 1 2.5 0
end

* Version 1: all available follow-up (t, d)
stset t, fail(d)
stcox x, strata(sid)
stcox x
sts, by(sid) name(g1, replace) xlabel(0/10)

* Version 2: unexposed censored when the exposed subject leaves the risk set (t2, d2)
stset t2, fail(d2)
stcox x, strata(sid)
stcox x
sts, by(sid) name(g2, replace) xlabel(0/10)
Harriet Forbes

Join Date: Aug 2024

Posts: 3
#4

19 Aug 2024, 04:30

Thank you Andrea. This is very helpful. It seems to me that the number of failures and the person-time at risk reported in the stratified Cox model output are therefore wrong, unless you manually edit the end dates. Would you agree?
Andrea Discacciati

Join Date: Feb 2016

Posts: 194
#5

20 Aug 2024, 05:50

I am not sure I agree. Stata is simply summarising whatever outcome data are in your dataset, reporting the total number of events and the sum of the person-time at risk. As the outcome data change (when you censor the unexposed observations), those two descriptive statistics change as well.
Because of the specific data structure (matched-cohort data with one exposed subject per set) and procedure (stratified Cox PH model), the HR estimate from the two versions of the outcome data is the same. But, again, those descriptive statistics are neither right nor wrong; they simply follow from the data you decided to use.
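One way to see these descriptive statistics directly is to tabulate them by exposure group after each stset. A minimal sketch, assuming the data have already been stset as in the output earlier in the thread and that exposed is the 0/1 exposure indicator; the per(1000) scaling is an arbitrary choice for illustration.

```stata
* Person-time, number of failures and crude rates by exposure group
stptime, by(exposed) per(1000)

* Time at risk, incidence rate and number of subjects by exposure group
stsum, by(exposed)
```

Running this on both versions of the exit dates should show the totals changing with the data, while the stratified HR stays the same.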
