Estimate risk for groups

Guest
#1

19 Jan 2019, 11:13

Ciao. Here is an example using a Stata dataset.

Code:

use http://www.stata-press.com/data/r13/drugtr
stset studytime, failure(died)
stcox drug age

Now, if we want the hazard ratio (HR) for subjects younger than 56 and for subjects aged 56 or older, we create a new variable:

Code:

gen lessthan56 = age
recode lessthan56 (min/55 = 0)
recode lessthan56 (56/max = 1)

But how can I get the HR for each of these two groups?
Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#2

19 Jan 2019, 11:22

Your code for creating lessthan56 is far more complicated than it needs to be. See -help fvvarlist- to learn about factor variable notation. Then read https://www3.nd.edu/~rwilliam/stats/Margins01.pdf, from the excellent Richard Williams, for a crystal clear introduction to the amazing -margins- command.

Code:

gen byte lessthan56 = (age < 56) if !missing(age)
stcox drug i.lessthan56
margins lessthan56

All of that said, unless there is a clear theoretical reason to believe that the hazard ratio changes abruptly at age 56, dichotomizing age in this way is a bad idea. It discards information (it treats a 55-year-old as radically different from a 56-year-old, yet treats that same 55-year-old as identical to a 5-year-old) and may introduce bias into the analysis. Categorizing inherently continuous variables can be helpful for displaying descriptive statistics, but such categorizations are usually unsuitable for analysis.
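A minimal sketch of the alternative to dichotomizing, assuming the same drugtr example data: age stays continuous in the model, and -margins- reports the predicted relative hazard at a few chosen ages. The ages 50, 55 and 60 are illustrative choices only and do not come from the thread.

```stata
* Sketch only: keep age continuous rather than splitting it at 56.
use http://www.stata-press.com/data/r13/drugtr, clear
stset studytime, failure(died)
stcox drug age

* Predicted relative hazard at selected ages (illustrative values),
* averaged over the observed distribution of drug in the sample.
margins, at(age = (50 55 60))
```

The at() values can be any ages within the range observed in the data; nothing in the model singles out 56 as a special cut point.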
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17730
#3

19 Jan 2019, 11:26

Reading Clyde's excellent (as usual) reply, the following reference comes to mind: https://www.ncbi.nlm.nih.gov/pubmed/16217841.

Kind regards,
Carlo
(Stata 19.0)
Guest
#4

19 Jan 2019, 12:26

Clyde Schechter, Carlo Lazzaro: thanks a bunch. I am actually just using this to explore analysis in Stata, and thank you very much for your helpful comments. Is it true that the -margins- command outputs the HR for both groups? I see that the margin values are close to the HR in the stcox model, but the confidence intervals differ greatly. Is that how it is supposed to be? Lastly is it acceptable to run the stcox model for each group in separate models?

Last edited by sladmin; 17 Oct 2019, 08:17. Reason: anonymize original poster
Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#5

19 Jan 2019, 12:37

Your model contains variables other than age, and the output from -margins- gives you hazard ratios that are adjusted to the overall sample distribution of the other variables. That is why they differ somewhat from the direct output of -stcox-. The standard errors are also calculated in different ways.

Lastly is it acceptable to run the stcox model for each group in separate models?

Acceptable for what purpose?
Guest
#6

19 Jan 2019, 12:47

Clyde Schechter, thanks a bunch. Is fitting separate models an acceptable way to get the HR for both groups?
Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#7

19 Jan 2019, 13:38

If you run -stcox drug i.age_group- with just two age groups, you will get one hazard ratio for age, not two, because one category becomes the reference, and the hazard ratio you get is the ratio of the hazard in one age group to the hazard in the other. So there really aren't two hazard ratios to compare if you do this (except in the rather trivial sense that one of them is arbitrarily constrained to 1).

If you run -stcox drug i.age_group if age_group == 0- and again for 1, you will get no results at all for age_group, because in each of these models age_group will be a constant, and so will be omitted from the model due to collinearity with the constant term. So, again, nothing to compare.
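A minimal sketch of what separate per-group fits can estimate, assuming the drugtr example data and using lessthan56 from earlier in the thread as a stand-in for age_group. Because the grouping variable is constant within each subsample, it is left out here, and each fit returns only the hazard ratio for drug within that age group.

```stata
* Sketch only: subgroup fits estimate the drug HR within each age group,
* not a hazard ratio for the age group itself.
use http://www.stata-press.com/data/r13/drugtr, clear
stset studytime, failure(died)
gen byte lessthan56 = (age < 56) if !missing(age)

stcox drug if lessthan56 == 1   // subjects younger than 56
stcox drug if lessthan56 == 0   // subjects aged 56 or older
```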

If you run -stcox drug i.age_group- followed by -margins age_group-, you will get two hazard ratios. What do they mean? Well, these hazard ratios will be adjusted for the distribution of drug, which may well differ among the age groups (especially if this is observational data). That is, they represent the average hazard ratios, relative to a base state where all of the model variables are 0, among those in each of those two age groups, with the distribution of drug (or any other model variables) standardized to its overall distribution in the entire sample. The ratio of those two hazard ratios will be similar to the single hazard ratio for age_group shown in the Cox regression output if the distribution of drug (and any other variables) is nearly the same in both age groups.
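The comparison described above can be sketched as follows, assuming the drugtr example data and lessthan56 as a stand-in for age_group. The post option and the -nlcom- step are additions for illustration, not something used earlier in the thread: post stores the two margins as estimation results so that their ratio can be computed and set beside the hazard ratio reported by -stcox-.

```stata
* Sketch only: ratio of the two predictive margins versus the stcox HR.
use http://www.stata-press.com/data/r13/drugtr, clear
stset studytime, failure(died)
gen byte lessthan56 = (age < 56) if !missing(age)

stcox drug i.lessthan56        // one HR for lessthan56 (reference: 0)

* post replaces the stored estimates with the two margins
margins lessthan56, post

* Ratio of the margin for group 1 to the margin for group 0
nlcom (ratio: _b[1.lessthan56] / _b[0.lessthan56])
```

Note that after margins, post the Cox results are no longer the active estimates, so the -stcox- output should be read (or saved with -estimates store-) before that step.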