flexible parametric models in stpm2 ; values of predicted hazard ratio

joel belmin

Join Date: Mar 2022

Posts: 2
#1

flexible parametric models in stpm2 ; values of predicted hazard ratio

13 Mar 2022, 10:49

Hi,
I am using Stata 16.1
I am analysing in a longitudinal study the associations between blood pressure (BP) level at baseline and mortality. I first have used a classic Cox model with a categorical BP variable (for systolic BP : ]90, [90-99[, ...[130-139[, ...).
I have obtained hazard ratios for each of the 7 values of this categorical variable). As expected, there are significant increases in HR for the lowest levels of BP and for the highest levels of BP (as compared to mid-range BP).
Now, I would like obtain a plot between systolic BP at baseline (as a continuous variable on x axis) and the predicted hazard ratio of mortality on y axis. Flexible parametric model seems to be the way to get it and I tried it with stpm2.

The continuous variable for BP is called PAS_moy, dureeOBSj if the time variable, DC2==1 is the living status (categorical)
my code is :
#
stset dureeOBSj, failure(DC2==1) scale(1)
stpm2 PAS_moy , scale(hazard) df(3) eform
predict hr_PS, hazard ci
/

When I plot with :
#
twoway (qfit hr_PS PAS_moy, lcolor(black) msymbol(point)) (qfit hr_PS_lci PAS_moy, lcolor(gray) lpattern(dash)) (qfit hr_PS_uci PAS_moy, lcolor(gray) lpattern(dash))
/

I obtain a nice graph with a U type curve as expected,.. But, ... the Y axis indicates "fiited values" ranging between 0.0008 and 0.0040, instead of hazard ratio estimate (in the Cox analysis they varied from 2.4 to 1.3, with the reference being the category 130-139).
I did not find how to get them on the flexible parametric model despite carefull reading of help and ressources. I suppose I miss something to define the predictied hr.
Will I get a little help from my friends ?

Joel
Tags: None
Paul Dickman

Join Date: Apr 2014

Posts: 294
#2

14 Mar 2022, 04:03

Below are some details of how to get the predicted HR. This is a strength of stpm2.

However, it's not clear what you expect. Your model contains only the main effect of BP. The HR will be constant for all values of BP and all values of time. As with the Cox model, the predicted hazard will vary over BP and time, but the HR will be constant. If the graph of the predicted HR is to be informative then you would need to include time-varying effects (the tvc() option) of an interaction between BP and a modifier.

This is where to find the help file.

Code:

help stpm2_postestimation

In the -predict- statement, hazard will give you the predicated hazard (as you noticed). To get the predicted HR you need to specify hrnumerator and hrdenominator.

Code:

hrdenominator(varname # ...) specifies the denominator of the hazard ratio. By default all covariates other than varname and any other variables mentioned are set to zero. See cautionary note in hrnumerator. If # is set to missing (.) then varname takes its observed values in the dataset. hrnumerator(varname # ...) predicts the (time-dependent) hazard ratio with the numerator of the hazard ratio. By default all covariates other than varname and any other variables mentioned are set to zero. Note that setting the remaining values of the covariates to zero may not always be sensible, particularly with models other than on the cumulative hazard scale, or when more than one variable has a time-dependet effect. If # is set to missing (.) then varname takes its observed values in the dataset.

Here's an example:
http://pauldickman.com/software/stat..._modification/
2 likes
Comment
joel belmin

Join Date: Mar 2022

Posts: 2
#3

15 Mar 2022, 14:58

Many thanks Paul for your post. I fixed hrdenominator and hrnumerator as you advised, and I have obtained the hazard ratio.
my code is now
#
stset dureeOBSj, failure(DC2==1) scale(1)
stpm2 PAS_moy , scale(hazard) df(3) eform
predict hr_PS, hrnumerator(PAS_moy .) hrdenominator(PAS_moy 130) ci
/
HR value is 1 for the values of PAS_moy at 130, and the curve is almost ready to be inserted in a manuscript.
Thanks again Paul. ;-D
Comment
Paul Dickman

Join Date: Apr 2014

Posts: 294
#4

16 Mar 2022, 02:38

Glad to help. Yes, the HR should be 1 at by definition.

However, I suggest you give more consideration to what you are plotting before you submit your manuscript. I don't have a full, understanding of what you are trying to achieve, or what you've done, but I suspect there may be concerns. In short, your code above assumes a loglinear effect of PAS_moy so your plotted HR will reflect that assumption.

Let's return to the code from your original post where you predicted and plotted the hazard. The following code estimates and plots the hazard at time 1.

Code:

stpm2 PAS_moy, scale(h) df(3) eform // predict hazard at time one generate one=1 predict h, hazard timevar(one) twoway line h PAS_moy, yscale(log)

This should be a straight line, because you have assumed a loglinear effect of PAS_moy on the outcome. You said your plot had a U shape. This could be because you plotted on a linear scale and forced a quadratic function by using -qfit-. That is, if you rerun my code above without yscale(log) you'll no longer see a straight line. Your U shape may also be due to you estimating the hazard at different values of time. By default, -predict- gives the predicted hazard at _t (the exit time). That's why I've used the timevar() option in the code above so that all estimates are at the same time.

Now we get to the problem. If you take the predicted hazard as a function of PAS_moy and divide it by a constant (the predicted hazard when PAS_moy=130) then the curve will have the exact same shape, it will just be rescaled. This is what you've done when predicting the HR in the code above. That is, the HR as a function of PAS_moy should be a loglinear function (i.e., a straight line) so is not especially informative.

If what you want is to study the functional form of the association then I suggest you relax the assumption of a linear effect of PAS_moy. That is, model PAS_moy as a spline.

Finally, please use CODE tags when posting code.
Comment

Martin Imelda Borg

Join Date: Jan 2022
Posts: 225

01 Dec 2023, 14:53

Hello there, in post 4 assumes that there are no covariates or TVC. I wanted to plot a hazard ratio as part of a cumulative variable: age

Code:

generate one=1
stpm2 age, scale(h) df(3) eform
predict h, hazard timevar(one)
twoway line h age, sort


//this works ---> but no spline introduced, no covariates introduced

However, I have agespl1 agespl2 generated from age. and male is a tvc.
I tried to set covariates at the mean level for male.
However I still have issues with regards to predicting - as I get two separate funny lines for agespl1 agespl2 rather than the smooth line produced from the code above

Code:

// generate spline variables for ageof diagnosis
rcsgen age, df(3) gen(agespl) orthog    ///ok  

// predict the HR at 1 year
generate t1=1

gen treatment2 = 0 
replace treatment2 = 1 if stage == 1 

gen modelmale = male 
gen modeltreatment2 = treatment2

stpm2 agespl* modelmale modeltreatment2, scale(h) df(5) eform ///
      tvc(agespl* modelmale) dftvc(2)

sum modelmale 
replace modelmale=0.47 //mean
sum modeltreatment2
replace modeltreatment2=0.87  //mean

predict h, hazard timevar(t1)
twoway line h agespl*, sort  //graph does not make sense

Comment

Paul Dickman

Join Date: Apr 2014

Posts: 294
#6

05 Dec 2023, 13:09

I don't understand what you are trying to do so am reluctant to give specific advice. A couple of quick general comments though.

1. You state you wish to estimate the hazard ratio, but your predict command estimates the hazard.
2. Unless you specify otherwise, predict will give predictions at observed values of covariates. Look at the help file and the examples on Paul Lambert's web page.
3. -stpm3- has an improved syntax for modelling and prediction
Comment

Announcement

flexible parametric models in stpm2 ; values of predicted hazard ratio

Comment

Comment

Comment

Comment

Comment