Prediction after regress using spline terms

AnnaMaria Nannavecchia

Join Date: Jul 2017

Posts: 4
#1

Prediction after regress using spline terms

26 Jul 2017, 05:57

Dear Users,

in order to estimate a time mortality trend I applied a regression where log mortality rate is the dependent variable and calendar year is the indipendent one;
in my dataset, available calendar years span from 1980 to 2013. Calendar years have been fitted using spline terms:

gen ln_tsd2=ln(tsd2)
rcsgen year, gen(rcs) orthog df(3)
regress ln_tsd2 rcs*

After that, predicted log rates have been obtained by:

predict pred

I'd like to obtain predicted rate ratios (and CI) comparing each year predicted rate to a reference year (e.g. 1997) rate.

My question:
Is there a way to obtain these year rate ratios by using "predictnl"?

Thanks for your attention,
best wishes

Anna Maria
Tags: None

Joseph Coveney

Join Date: Apr 2014
Posts: 4410

26 Jul 2017, 18:45

Yes, you can obtain them with predictnl. The illustration shows the basic mechanics of an approach that gives you what you're looking for. Start at the "Begin here" comment; the first part of the do-file just creates a fictional dataset for illustration.

In the illustration, I used glm with a logarithmic link function instead of taking the logarithm of the mortality values and using regress as you did. Also, I used the official Stata command mkspline to generate restricted cubic splines, while you used a user-written command. If you want to pursue your approach, then you will need to adapt the code in the illustration with these differences in mind.

Code:

version 15.0

clear *
quietly set obs 34
set seed 1403850
generate int year = 1979 + _n
generate double mortality = exp(rnormal())

*
* Begin here
*
mkspline rcs = year, cubic nknots(4)

list year in 18, noobs
scalar define rcs1 = rcs1[18]
scalar define rcs2 = rcs2[18]
scalar define rcs3 = rcs3[18]

glm mortality c.rcs?, family(gaussian) link(log) nolog
predict double mortality_hat, mu

local expression exp(predict(xb) - _b[_cons] - _b[rcs1] * scalar(rcs1) - ///
    _b[rcs2] * scalar(rcs2) - _b[rcs3] * scalar(rcs3))

margins , over(year) expression(`expression')

predictnl double rate_ratio = `expression', ci(lb ub)

format mortality mortality_hat rate_ratio lb ub %03.2f

list year mortality mortality_hat rate_ratio lb ub, noobs separator(0) abbreviate(20)

exit

Comment

Andrea Discacciati

Join Date: Feb 2016
Posts: 194

27 Jul 2017, 01:56

This is an alternative to Joseph's excellent answer. Unlike Joseph, here I obtain the CIs for the RR by transforming the endpoints of the CIs on the log(RR) scale.

Code:

// Run Joseph's code first

predictnl logrr = _b[rcs1]*(rcs1-scalar(rcs1)) + _b[rcs2]*(rcs2-scalar(rcs2)) ///
    + _b[rcs3]*(rcs3-scalar(rcs3)), se(logrr_se)
gen rr = exp(logrr)
gen rr_lb = exp(logrr - 1.96*logrr_se)
gen rr_ub = exp(logrr + 1.96*logrr_se)

format rr rr_lb rr_ub %03.2f

list year mortality mortality_hat rate_ratio lb ub rr rr_lb rr_ub, noobs separator(0) abbreviate(20)

line rr rr_lb rr_ub year,  lp(l _ _)  legend(off) scheme(s2mono)

Last edited by Andrea Discacciati; 27 Jul 2017, 02:03.

Comment

AnnaMaria Nannavecchia

Join Date: Jul 2017

Posts: 4
#4

01 Aug 2017, 16:32

Dear Joseph and Andrea, thank you both for your precious advises.
Anna Maria
Comment

Announcement