Graphing curvilinear interaction effects on Cox proportional hazards models

Giuseppe Criaco

Join Date: Aug 2014

Posts: 42
#1

02 Oct 2016, 06:08

Dear members,

I am computing a curvilinear interaction effect on a Cox proportional hazards model:

stcox IV c.IV#c.IV MV c.IV#c.MV c.IV#c.IV#c.MV

I would like to graph this interaction effect. I usually compute margins and plot them; however, it does not make sense to do so in Cox proportional hazards model. Can you please give me some suggestions on how to overcome this issue?

Any help would be much appreciated.

Thanks,

Giuseppe
Tags: curvilinear, graph, interaction, stcox, survival analysis
Clyde Schechter

Join Date: Apr 2014

Posts: 28611
#2

02 Oct 2016, 10:09

"it does not make sense to do so in Cox proportional hazards model."

Why not? I would think that plots of the hazard ratios at selected interesting combinations of the predictor variables would be quite useful.
Giuseppe Criaco

Join Date: Aug 2014

Posts: 42
#3

02 Oct 2016, 11:21

Hi Clyde,

thanks for your quick answer!

When I try to plot the following:

stcox IV c.IV#c.IV MV c.IV#c.MV c.IV#c.IV#c.MV

margins, at (IV=(1(1)5) MV=(0 1)) vsquish
marginsplot, noci recast(line) scheme(S1mono)

the graph goes off-scale on the y-axis. This seems to be a mistake. How would you recommend plotting the curvilinear interaction so that the values on the y-axis are between 0 and 1?

All the best,

Giuseppe
Clyde Schechter

Join Date: Apr 2014

Posts: 28611
#4

02 Oct 2016, 11:40

What do you mean the graph goes off scale? And why do you expect that the hazard ratios (which are what are plotted on the y-axis) would be restricted to be between 0 and 1? They can be any non-negative value.
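
To make the point about the range concrete: the quantity plotted is the relative hazard exp(xb), and the linear predictor xb can be any real number. Here is a minimal Python sketch; the xb values are made up purely for illustration and are not taken from any model in this thread.

```python
import math

# Hypothetical linear predictor values (xb); illustrative only.
linear_predictors = [-3.0, -0.5, 0.0, 0.5, 3.0, 10.0]


def relative_hazard(xb):
    """Relative hazard implied by a Cox linear predictor: exp(xb)."""
    return math.exp(xb)


for xb in linear_predictors:
    print(f"xb = {xb:6.1f}  ->  relative hazard = {relative_hazard(xb):.4f}")
```

A negative xb gives a value between 0 and 1, xb = 0 gives exactly 1, and a positive xb gives a value above 1 with no ceiling.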

I think in order to give you more concrete advice, you should post the -margins- output so we can see what you're talking about.
Giuseppe Criaco

Join Date: Aug 2014

Posts: 42
#5

02 Oct 2016, 11:45

Hi Clyde,

please find attached the margins output

All the best,

Giuseppe

Last edited by Giuseppe Criaco; 02 Oct 2016, 11:48.
Clyde Schechter

Join Date: Apr 2014

Posts: 28611
#6

02 Oct 2016, 11:55

Well, the shapes of the curves look reasonable. I agree that hazard ratios in the range of 20000+ are not reasonable. But the problem is not likely to arise from the way the results are being graphed. I suspect it is the results of your Cox model that must be somehow wrong. I suggest you post the exact code you ran for the Cox model and for the -margins- command, along with the exact and complete output you got from both of those. (What you showed above is the output from -marginsplot-, which is helpful, but suggests the need for additional information about the earlier steps.)

I have occasionally seen results like this, though, where the failure event in the baseline condition (MV = 0 & IV = 0) is extremely rare.
Giuseppe Criaco

Join Date: Aug 2014

Posts: 42
#7

02 Oct 2016, 13:00

Hi Clyde,

please find attached the exact code I have run for the Cox model and for the -margins- command.

Thanks for all your help!

Giuseppe
Attached Files

Example Statalist.docx (15.4 KB, 1 view)
Clyde Schechter

Join Date: Apr 2014

Posts: 28611
#8

02 Oct 2016, 13:28

Giuseppe: Please read the FAQ, esp. #12. The way to post code and output is to put it between code delimiters right here in the Forum editor.

Some of the most frequent responders on this forum do not use Microsoft Office products. I do, but because Office documents can contain active content, including malware, I don't download them from strangers.
Giuseppe Criaco

Join Date: Aug 2014

Posts: 42
#9

02 Oct 2016, 13:33

Hi Clyde,

sorry for the misunderstanding. Here it is:

stcox CV1 CV2 i.CV3 i.CV4 CV5-CV9 IV MV c.IV#c.IV c.IV#c.MV c.IV#c.IV#c.MV, cluster(ID) strata (strata_variable)

failure _d: failed1 == 1
analysis time _t: (year0-origin)
origin: time origin_year_lagged
id: Fad_F_Id

Iteration 0: log pseudolikelihood = -133.59234
Iteration 1: log pseudolikelihood = -120.52875
Iteration 2: log pseudolikelihood = -120.12881
Iteration 3: log pseudolikelihood = -119.40349
Iteration 4: log pseudolikelihood = -119.40219
Iteration 5: log pseudolikelihood = -119.40219
Refining estimates:
Iteration 0: log pseudolikelihood = -119.40219

Stratified Cox regr. -- Breslow method for ties

No. of subjects      =          288             Number of obs    =       1,153
No. of failures      =          160
Time at risk         =         1153
                                                Wald chi2(20)    =       62.15
Log pseudolikelihood =   -119.40219             Prob > chi2      =      0.0000

                                     (Std. Err. adjusted for 288 clusters in ID)
--------------------------------------------------------------------------------
               |               Robust
            _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
           CV1 |   .6773376   .6466136    -0.41   0.683     .1042832    4.399428
           CV2 |    .799419   .3179738    -0.56   0.574     .3666097    1.743191
               |
           CV3 |
          2004 |   1.883343   .7406755     1.61   0.107     .8713076    4.070872
          2005 |   1.252482   .4623409     0.61   0.542     .6075183    2.582164
          2006 |   1.367779   .5227753     0.82   0.413     .6466664     2.89302
          2007 |   .6190037   .3017687    -0.98   0.325     .2380819    1.609386
          2008 |    1.49565   .8159533     0.74   0.461     .5134005    4.357163
          2009 |          1  (omitted)
          2010 |          1  (omitted)
               |
           CV4 |
          2004 |   1.164307   .3514033     0.50   0.614     .6444119     2.10364
          2005 |   1.601702   .5340156     1.41   0.158     .8332719    3.078766
          2006 |   1.328332   .4661388     0.81   0.418     .6677325    2.642473
               |
           CV5 |   4.194037   1.539758     3.91   0.000     2.042351    8.612599
           CV6 |   1.005075   .0060563     0.84   0.401     .9932745    1.017015
           CV7 |   .8820815   .0475719    -2.33   0.020     .7936011    .9804267
           CV8 |   1.527727   .4494722     1.44   0.150     .8582501    2.719428
           CV9 |   .9672293    .019678    -1.64   0.101     .9294201    1.006577
            IV |   .9044792   .2286269    -0.40   0.691     .5511106    1.484426
            MV |   1.656697   .9238018     0.91   0.365     .5553911    4.941823
               |
     c.IV#c.IV |   1.003382   .0265346     0.13   0.898     .9527001    1.056761
               |
     c.IV#c.MV |   .5199431   .1723461    -1.97   0.048     .2715234    .9956449
               |
c.IV#c.IV#c.MV |   1.070526   .0344607     2.12   0.034     1.005071    1.140244
--------------------------------------------------------------------------------
Stratified by strata_variable

.
. margins, at (IV=(1(1)10) MV=(0 1)) vsquish

Predictive margins Number of obs = 1,153
Model VCE : Robust

Expression : Relative hazard, predict()
1._at : IV = 1
MV = 0
2._at : IV = 1
MV = 1
3._at : IV = 2
MV = 0
4._at : IV = 2
MV = 1
5._at : IV = 3
MV = 0
6._at : IV = 3
MV = 1
7._at : IV = 4
MV = 0
8._at : IV = 4
MV = 1
9._at : IV = 5
MV = 0
10._at : IV = 5
MV = 1
11._at : IV = 6
MV = 0
12._at : IV = 6
MV = 1
13._at : IV = 7
MV = 0
14._at : IV = 7
MV = 1
15._at : IV = 8
MV = 0
16._at : IV = 8
MV = 1
17._at : IV = 9
MV = 0
18._at : IV = 9
MV = 1
19._at : IV = 10
MV = 0
20._at : IV = 10
MV = 1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
           1 |   99070.19   489149.1     0.20   0.839    -859644.4     1057785
           2 |   91356.41   449907.4     0.20   0.839    -790445.9    973158.7
           3 |   90519.23   449031.4     0.20   0.840    -789566.1    970604.6
           4 |   53245.64   264264.9     0.20   0.840      -464704    571195.3
           5 |   83266.74   414573.1     0.20   0.841    -729281.6      895815
           6 |   35806.03   178928.4     0.20   0.841    -314887.1    386499.2
           7 |   77114.33   384696.8     0.20   0.841    -676877.5    831106.1
           8 |   27781.48     139564     0.20   0.842    -245758.9    301321.9
           9 |   71900.43   358656.3     0.20   0.841      -631053    774853.8
          10 |   24870.33   125357.9     0.20   0.843    -220826.7    270567.3
          11 |   67493.32   335944.5     0.20   0.841    -590945.7    725932.4
          12 |   25688.27   129637.9     0.20   0.843    -228397.3    279773.8
          13 |   63785.64   316234.5     0.20   0.840    -556022.5    683593.8
          14 |   30613.66   154352.1     0.20   0.843    -271910.9    333138.3
          15 |    60690.1   299344.7     0.20   0.839    -526014.7    647394.9
          16 |   42094.25   211628.2     0.20   0.842    -372689.3    456877.8
          17 |   58136.07   285220.2     0.20   0.838    -500885.2    617157.3
          18 |   66781.68   334255.1     0.20   0.842    -588346.3    721909.7
          19 |   56066.87   273924.5     0.20   0.838    -480815.3    592949.1
          20 |   122241.6   608530.8     0.20   0.841     -1070457     1314940
------------------------------------------------------------------------------
Warning: Multiple observations per subject are detected. Predictions that require averaging over the dataset may not be appropriate. Use the at() option to compute predictions at fixed
values of the covariates.

.
. marginsplot, noci recast(line) scheme(s1mono)

Variables that uniquely identify margins: IV MV

.
. log close

Giuseppe

Last edited by Giuseppe Criaco; 02 Oct 2016, 13:37.
Clyde Schechter

Join Date: Apr 2014

Posts: 28611
#10

02 Oct 2016, 13:51

Well, I'm having a hard time understanding how the -margins- output came from that -stcox- output. If some of the continuous CV* variables take on very large values, then this might explain it. In fact it would only take a small number of observations having extremely large values on some covariate for this to happen. Have you looked at summary statistics on your continuous CV's? They might reveal a data problem.
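
One way to reconcile the two outputs is to split each margin into a part that depends only on IV and MV and a part that depends only on the other covariates. The Python sketch below uses the hazard ratios and margins copied from #9; the function names are mine. It checks two things: whether the ratio of the MV = 1 margin to the MV = 0 margin at each IV matches the hazard ratio implied by the MV, c.IV#c.MV and c.IV#c.IV#c.MV terms, and whether dividing each MV = 0 margin by the IV part leaves the same scale factor at every IV.

```python
# Hazard ratios copied from the -stcox- table in #9.
HR_IV = 0.9044792       # IV
HR_IV2 = 1.003382       # c.IV#c.IV
HR_MV = 1.656697        # MV
HR_IV_MV = 0.5199431    # c.IV#c.MV
HR_IV2_MV = 1.070526    # c.IV#c.IV#c.MV

# Predictive margins copied from the -margins- table in #9: {IV: (MV=0, MV=1)}
MARGINS = {
    1: (99070.19, 91356.41),
    2: (90519.23, 53245.64),
    3: (83266.74, 35806.03),
    4: (77114.33, 27781.48),
    5: (71900.43, 24870.33),
    6: (67493.32, 25688.27),
    7: (63785.64, 30613.66),
    8: (60690.10, 42094.25),
    9: (58136.07, 66781.68),
    10: (56066.87, 122241.6),
}


def model_hr_mv(iv):
    """Hazard ratio for MV = 1 versus MV = 0 at a given IV."""
    return HR_MV * HR_IV_MV ** iv * HR_IV2_MV ** (iv * iv)


def model_hr_iv(iv):
    """Hazard ratio for IV = iv versus IV = 0, holding MV at 0."""
    return HR_IV ** iv * HR_IV2 ** (iv * iv)


def implied_scale(iv):
    """MV = 0 margin divided by the IV part: the contribution of everything else."""
    return MARGINS[iv][0] / model_hr_iv(iv)


for iv, (m0, m1) in MARGINS.items():
    print(f"IV = {iv:2d}  margins ratio = {m1 / m0:.4f}  "
          f"model HR = {model_hr_mv(iv):.4f}  scale = {implied_scale(iv):.0f}")
```

If the two ratio columns agree and the scale column is constant, then the shape of the curves comes entirely from the IV and MV terms, and the enormous level of the margins comes from averaging exp(xb) of the remaining covariates over the sample.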

But there are two things that catch my eye as red flags.

One is that -margins- itself is warning you that calculating averages like this "may not be" (read: usually isn't) appropriate when you have multiple observations per subject and that it is better to specify fixed values for all of the other covariates. If you don't want to choose particular values for each, you might do that quickly by just using the -atmeans- option.
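
To illustrate why the choice matters, here is a rough Python sketch comparing the two approaches for a single covariate. The hazard ratio is the CV5 estimate from #9; the covariate values are made up for illustration, with one deliberately extreme observation, and are not from the actual data.

```python
import math
from statistics import mean

HR_PER_UNIT = 4.194037          # CV5 hazard ratio from the -stcox- table in #9
b = math.log(HR_PER_UNIT)       # the corresponding coefficient

# Hypothetical covariate values; the last one is an extreme observation.
x = [0.0, 0.5, 1.0, 1.0, 1.5, 9.0]

# Averaging over the data: predict exp(b*x) for each observation, then average.
average_of_predictions = mean(math.exp(b * xi) for xi in x)

# Fixing the covariate at its mean: predict once, at mean(x).
prediction_at_mean = math.exp(b * mean(x))

print(f"average of exp(b*x) over observations: {average_of_predictions:,.1f}")
print(f"exp(b*mean(x)):                        {prediction_at_mean:,.1f}")
```

Because exp() is convex, the average of the predictions can never be smaller than the prediction at the mean, and a single extreme value is enough to push it up by orders of magnitude.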

The other thing that strikes me as suspicious is:

Code:

No. of subjects      =          288             Number of obs    =       1,153
No. of failures      =          160
Time at risk         =         1153

In real life data, it would be a remarkable coincidence that the number of observations and the total time at risk are the same number. It is as if each participant/entity in your study were at risk for exactly 1 time period. I suppose that is possible, but it makes me wonder if you have properly -stset- your data.
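
For reference, one setup in which the two numbers match by construction is one record per subject per time unit, since total time at risk is just the sum of the record lengths. A small Python sketch with made-up records (the IDs and times are hypothetical):

```python
# Hypothetical records: (subject_id, start_time, stop_time), one per time unit.
records = [
    ("A", 0, 1), ("A", 1, 2), ("A", 2, 3),
    ("B", 0, 1), ("B", 1, 2),
    ("C", 0, 1),
]

n_obs = len(records)
n_subjects = len({subject for subject, _, _ in records})
time_at_risk = sum(stop - start for _, start, stop in records)

print(f"subjects = {n_subjects}, observations = {n_obs}, time at risk = {time_at_risk}")
```

Whether that is what is happening here depends on how the data were -stset-.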

I'm also a little puzzled that you are using one variable, ID, for your -vce(cluster)- option, but a different variable to define study IDs in your -stset-. This, too, isn't necessarily wrong, but it's unusual.
Giuseppe Criaco

Join Date: Aug 2014

Posts: 42
#11

02 Oct 2016, 14:22

Dear Clyde,

thanks for your feedback. Some clarifications:

1) My DV is failure of new firms. So I am basically following new firms since inception. The time frame is 1 year, which means that failure could happen any year after founding. This explains why number of obs. and time at risk are the same number.

2) The ID variable for the vce(cluster) and stset is indeed the same.

I have tried to remove those CVs with extremely large values and hazard ratios look much more reasonable: many thanks for this hint!!!

All the best,

Giuseppe