Calculate Annual Percentage Change using joinpoint/nl hockey

Josna Rani

Join Date: Feb 2019

Posts: 20
#1

09 May 2023, 23:59

Hello,
I have data on years and corresponding cancer cases.

year cases
2010 7.07143
2011 7.09581
2012 7.12348
2013 7.15027
2014 7.17207
2015 7.1936
2016 7.21309
2017 7.23115
2018 7.24734
2019 7.26185
2020 7.27493
2021 7.28676
2022 7.29735
2023 7.30616
2024 7.3131
2025 7.31808
2026 7.32111
2027 7.32244
2028 7.32209
2029 7.31986
2030 7.3155
2031 7.30892
2032 7.30047
2033 7.29041
2034 7.27849
2035 7.26443
2036 7.24815
2037 7.22982
2038 7.20979
2039 7.18799
2040 7.16415
2041 7.13816
2042 7.11005
2043 7.07997
2044 7.04794
2045 7.01377
2046 6.97755
2047 6.93938
2048 6.89907
2049 6.85672
2050 6.81231

I want to calculate the annual percentage change (APC) over the 40 years. The recommended way to do that is to identify the joints, that is, segments with different slopes, and then use the APC formula APC_i = {exp(b_i) - 1} x 100, where b_i is the slope coefficient for the ith segment, with i indexing the segments in the desired range of years.
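For reference, the APC formula can be derived from a log-linear trend within each segment. The short derivation below is a sketch that assumes the outcome y is modelled on the log scale; the post does not say whether cases is logged, so treat that as an assumption. The symbols y_t and a_i are introduced here for illustration only.

```latex
\ln y_t = a_i + b_i\, t
\;\Longrightarrow\;
\frac{y_{t+1}}{y_t} = e^{b_i}
\;\Longrightarrow\;
\mathrm{APC}_i = \left( e^{b_i} - 1 \right) \times 100
```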

#My first question is what command can I use to identify the number of segments in the data? That way I can use piecewise linear regression.

I have read in other posts about the nl hockey program. It identifies 2 segments only. I have tried using it. The result looks like this:

HTML Code:

. nl hockey cases year
(obs = 41)

Iteration 0:  residual SS =  1.19e+10
Iteration 1:  residual SS =  .0401542
Iteration 2:  residual SS =  .0333771
Iteration 3:  residual SS =  .0330889
Iteration 4:  residual SS =  .0330671

      Source |      SS            df       MS            Number of obs =         41
-------------+------------------------------            F(  3,    37) =     275.09
       Model |  .737541997         3  .245847332         Prob > F      =     0.0000
    Residual |  .033067058        37  .000893704         R-squared     =     0.9571
-------------+------------------------------            Adj R-squared =     0.9536
       Total |  .770609056        40  .019265226         Root MSE      =   .0298949
                                                         Res. dev.     =  -175.6814
(hockey)
------------------------------------------------------------------------------
       cases | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
  breakpoint |   2030.589   .4860736  4177.53   0.000     2029.604    2031.574
     slope_l |   .0123611   .0010773    11.47   0.000     .0101782     .014544
     slope_r |  -.0260939   .0011593   -22.51   0.000    -.0284428   -.0237449
        cons |  -17.72384    2.17623    -8.14   0.000     -22.1333   -13.31438
------------------------------------------------------------------------------
* Parameter cons taken as constant term in model & ANOVA table
 (SEs, P values, CIs, and correlations are asymptotic approximations)

HTML Code:

predict cases_hat
(option yhat assumed; fitted values)

graph twoway scatter cases_hat year || line cases_hat year

#I now want to use the coefficients slope_l and slope_r to calculate the APC for the entire range. So the formula would be {exp[(years in segment 1 x slope 1 + years in segment 2 x slope 2) / total years] - 1} x 100. How can I implement that? I am not sure how to save the slope coefficients and calculate the range of years in each segment without manually looking them up from the results and graphs.

Thanks
Josna
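
One possible way to avoid the manual lookup is sketched below. It is not from the thread and rests on several assumptions: that it is run directly after nl hockey cases year with the data still in memory, that e(b) stores the parameters in the order shown in the output table (breakpoint, slope_l, slope_r, cons), and that cases is on the log scale so that exp(b) - 1 is meaningful. The names b, bp, s_l, s_r, w_l, w_r, apc_l, apc_r and aapc are illustrative.

```stata
* Sketch only: run directly after -nl hockey cases year-.
* Assumes e(b) is ordered as in the output table:
* breakpoint, slope_l, slope_r, cons. Check with -matrix list e(b)-.
matrix b = e(b)
scalar bp  = b[1,1]
scalar s_l = b[1,2]
scalar s_r = b[1,3]

* First and last year come from the estimation sample, not from a graph
summarize year if e(sample), meanonly
scalar w_l = bp - r(min)      // years covered by the left segment
scalar w_r = r(max) - bp      // years covered by the right segment

* Segment-specific APCs
scalar apc_l = (exp(s_l) - 1) * 100
scalar apc_r = (exp(s_r) - 1) * 100

* Length-weighted average of the slopes, converted to a percentage
scalar aapc = (exp((w_l*s_l + w_r*s_r) / (w_l + w_r)) - 1) * 100

display "APC, left segment  = " %7.3f apc_l
display "APC, right segment = " %7.3f apc_r
display "Average APC        = " %7.3f aapc
```

Because the weights are the segment lengths, the average always lies between the two segment APCs. This sketch gives point estimates only; confidence intervals would need something like nlcom, which is not shown here.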
Maarten Buis

Join Date: Mar 2014

Posts: 3456
#2

10 May 2023, 01:51

The first thing you should always do is look at the data. The first thing to notice is that this is not observed or empirical data (2050 hasn't happened yet). So we are dealing with a projection, which means that there is already an underlying model. Ideally, you already have that model and its coefficients, and you can derive the growth rate from those. If that is not possible, then you can plot the projection:

Code:

clear
input year cases
2010 7.07143
2011 7.09581
2012 7.12348
2013 7.15027
2014 7.17207
2015 7.1936
2016 7.21309
2017 7.23115
2018 7.24734
2019 7.26185
2020 7.27493
2021 7.28676
2022 7.29735
2023 7.30616
2024 7.3131
2025 7.31808
2026 7.32111
2027 7.32244
2028 7.32209
2029 7.31986
2030 7.3155
2031 7.30892
2032 7.30047
2033 7.29041
2034 7.27849
2035 7.26443
2036 7.24815
2037 7.22982
2038 7.20979
2039 7.18799
2040 7.16415
2041 7.13816
2042 7.11005
2043 7.07997
2044 7.04794
2045 7.01377
2046 6.97755
2047 6.93938
2048 6.89907
2049 6.85672
2050 6.81231
end

twoway line cases year

This screams polynomial (probably quadratic?) to me (and makes me extremely suspicious about the validity of that projection, but that is another story). So we can try to recover that model with its parameters.

Code:

// with quadratics it is often easier to first center the variable
gen yearc = year - 2030
poisson cases c.yearc##c.yearc, vce(robust)
predict mu1
twoway scatter cases year || ///
       line mu1 year,         ///
       lpattern(solid) legend(off)

Close, but not quite. It is, however, close enough to suspect that this projection is indeed based on a polynomial. So let's add a cubic term.

Code:

poisson cases c.yearc##c.yearc##c.yearc, vce(robust)
predict mu2
twoway scatter cases year || ///
       line mu2 year,         ///
       lpattern(solid) legend(off)
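
The visual comparison of the two fits can be backed up with a quick numeric check. The lines below are a sketch, not part of the original answer; they assume mu1 (quadratic) and mu2 (cubic) exist from the predict calls above, and res1 and res2 are illustrative variable names.

```stata
* Sketch: absolute deviation of each fitted curve from the projection.
* Assumes mu1 and mu2 were created by the -predict- calls above.
generate double res1 = abs(cases - mu1)   // quadratic fit
generate double res2 = abs(cases - mu2)   // cubic fit

* The Max column shows the worst-case miss of each polynomial
summarize res1 res2
```

A much smaller maximum for res2 than for res1 would support the impression from the graphs that the cubic tracks the projection more closely.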

I think we found our model. Now the growth rate. (Again, I am really, really, really suspicious about the validity of this projection.) The formula for the growth rate you gave is not true in general. In general you want the first derivative with respect to year. For a polynomial that is not too hard, but you can also just use margins:

Code:

margins, dydx(yearc) over(year)
marginsplot, plotopts(msymbol(i)) ///
    recastci(rarea) ciopts(astyle(ci))
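
After poisson, margins differentiates the predicted count by default, so the call above reports the change in cases per year rather than a relative change. A possible variant, offered as a sketch and not part of the original answer, asks for the derivative of the linear predictor instead. With the log link this is d ln(mu)/d year, which for the cubic equals b1 + 2*b2*yearc + 3*b3*yearc^2 and, multiplied by 100, approximates the percentage change per year. It assumes the cubic poisson model is still the active estimation result.

```stata
* Sketch: assumes the cubic -poisson- model above is the active estimation result.
* predict(xb) makes -margins- differentiate the linear predictor,
* i.e. the log of the expected value, giving a relative change per year.
margins, dydx(yearc) over(year) predict(xb)
marginsplot, plotopts(msymbol(i))      ///
    recastci(rarea) ciopts(astyle(ci)) ///
    yline(0) ytitle("d ln(mu) / d year")
```

The year in which the curve crosses the zero line is where the projection turns from growth to decline.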

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Josna Rani

Join Date: Feb 2019

Posts: 20
#3

10 May 2023, 23:48

Thanks Maarten for such a detailed response.
You're right. It is a projection of cancer cases based on age-specific rates and a population projection. Since the age-specific rates are assumed to be constant, the case projection is largely driven by the projected population growth. The population size decreases after 2030. I take your concern on board and will double-check my calculations.
The formula for the growth rate I shared is generally used for age-standardized incidence rates. I wanted to try it out with the number of cases, as I need to calculate the annual percentage change for this.
Thanks again for your help!