Dear all,
I have a dataset like the one summarized below. I would like to fit a Poisson regression model with dq_m as the dependent variable and ttb_q, tmin_q, tmax_q, amtb_q, dofhw_q and qhw as independent variables.
After fitting the Poisson model, I found that a negative binomial regression model is more appropriate for this dataset because of overdispersion. However, the final model I arrived at gives an unexpected result. As the attached graph shows, the model does not fit the observed values well. What should I do to improve the current model and find the best one?
Furthermore, because of the limitations of secondary data, the results from this dataset are very different from those reported in medical articles. I would like to know whether I used the wrong modelling process or the wrong commands, or whether the dataset is simply poor.
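For reference, here is a minimal sketch of how the overdispersion check described above could be run in Stata. It assumes the dataset and variable names shown in the output in this post. The variance-to-mean ratio is only a rough, unconditional indicator, and the Poisson fit with robust standard errors is included as a comparison, not as a recommended final model.

```stata
* Sketch only: check for overdispersion before moving from Poisson to negative binomial
use Quarterly_Dofhw_Env_Analysis, clear

* Raw variance-to-mean ratio of the outcome (a value far above 1 hints at overdispersion)
quietly summarize dq_m
display "variance/mean = " r(Var)/r(mean)

* Fit the Poisson model, then request the deviance and Pearson goodness-of-fit tests
poisson dq_m ttb_q tmin_q tmax_q amtb_q dofhw_q qhw, nolog
estat gof

* Same Poisson model with robust standard errors, for comparison with nbreg
poisson dq_m ttb_q tmin_q tmax_q amtb_q dofhw_q qhw, vce(robust) nolog
```

A significant goodness-of-fit statistic from estat gof indicates that the Poisson model fits poorly, which is consistent with the likelihood-ratio test of alpha=0 that nbreg reports.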
Thank you all in advance!
Another issue is that the number of heart attack cases I have is quarterly data. I could not find any articles that use quarterly data to analyse the impact of heatwave events on heart disease. Do you have any ideas about that? I really appreciate all of your advice!
Code:
. clear
. use Quarterly_Dofhw_Env_Analysis
. sum
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
ttb_q | 36 28.27778 .7403131 27.3 30.3
tmin_q | 36 25.28056 .8454312 24.1 27.5
tmax_q | 36 33.57778 .8414648 32.2 35.8
amtb_q | 36 75.14444 4.501202 65.3 82.8
rtsh_q | 36 9.438889 4.036496 2 19.1
-------------+--------------------------------------------------------
hmax_pa_q | 36 113.2194 12.41303 92.6 137.7
dq_m | 36 2847.472 2073.42 765 11156
dq_c | 36 91.38889 61.60718 4 310
quarter | 36 201.5 10.53565 184 219
id | 36 18.5 10.53565 1 36
-------------+--------------------------------------------------------
dofhw_q | 36 2.222222 5.037636 0 22
qhw | 36 .25 .439155 0 1
. gr box dq_m,by(qhw)
. poisson dq_m ttb_q tmin_q tmax_q amtb_q dofhw_q qhw, nolog
Poisson regression Number of obs = 36
LR chi2(6) = 18257.00
Prob > chi2 = 0.0000
Log likelihood = -12276.162 Pseudo R2 = 0.4265
------------------------------------------------------------------------------
dq_m | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ttb_q | 1.163322 .0260642 44.63 0.000 1.112237 1.214407
tmin_q | -.641617 .0199307 -32.19 0.000 -.6806803 -.6025536
tmax_q | -.3136857 .020455 -15.34 0.000 -.3537768 -.2735945
amtb_q | .1041063 .0017078 60.96 0.000 .1007591 .1074534
dofhw_q | .0151342 .0014126 10.71 0.000 .0123656 .0179027
qhw | -.5604745 .0162171 -34.56 0.000 -.5922594 -.5286896
_cons | -6.012431 .3063855 -19.62 0.000 -6.612936 -5.411927
------------------------------------------------------------------------------
. nbreg dq_m ttb_q tmin_q tmax_q amtb_q dofhw_q qhw, nolog
Negative binomial regression Number of obs = 36
LR chi2(6) = 26.50
Dispersion = mean Prob > chi2 = 0.0002
Log likelihood = -301.9212 Pseudo R2 = 0.0420
------------------------------------------------------------------------------
dq_m | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ttb_q | 1.023928 .6093453 1.68 0.093 -.1703668 2.218223
tmin_q | -.410953 .5268968 -0.78 0.435 -1.443652 .6217458
tmax_q | -.4743927 .5131714 -0.92 0.355 -1.48019 .5314047
amtb_q | .0933756 .0412971 2.26 0.024 .0124348 .1743164
dofhw_q | .0136807 .0287798 0.48 0.635 -.0427266 .0700881
qhw | -.3632879 .3167615 -1.15 0.251 -.9841291 .2575533
_cons | -1.741708 7.948207 -0.22 0.827 -17.31991 13.83649
-------------+----------------------------------------------------------------
/lnalpha | -1.630928 .2288978 -2.07956 -1.182297
-------------+----------------------------------------------------------------
alpha | .1957478 .0448062 .1249852 .3065737
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 2.4e+04 Prob>=chibar2 = 0.000
. nbreg dq_m qhw dofhw_q amtb_q,nolog
Negative binomial regression Number of obs = 36
LR chi2(3) = 23.61
Dispersion = mean Prob > chi2 = 0.0000
Log likelihood = -303.36761 Pseudo R2 = 0.0375
------------------------------------------------------------------------------
dq_m | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
qhw | -.4479989 .3051038 -1.47 0.142 -1.045991 .1499936
dofhw_q | .0200182 .0246346 0.81 0.416 -.0282647 .0683012
amtb_q | .0914506 .0217634 4.20 0.000 .0487951 .1341062
_cons | 1.054483 1.667006 0.63 0.527 -2.212788 4.321755
-------------+----------------------------------------------------------------
/lnalpha | -1.555352 .2282957 -2.002803 -1.107901
-------------+----------------------------------------------------------------
alpha | .211115 .0481967 .1349564 .3302515
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 2.6e+04 Prob>=chibar2 = 0.000
. listcoef,percent
nbreg (N=36): Percentage Change in Expected Count
Observed SD: 2073.4199
----------------------------------------------------------------------
dq_m | b z P>|z| % %StdX SDofX
-------------+--------------------------------------------------------
qhw | -0.44800 -1.468 0.142 -36.1 -17.9 0.4392
dofhw_q | 0.02002 0.813 0.416 2.0 10.6 5.0376
amtb_q | 0.09145 4.202 0.000 9.6 50.9 4.5012
-------------+--------------------------------------------------------
ln alpha | -1.55535
alpha | 0.21112 SE(alpha) = 0.04820
----------------------------------------------------------------------
LR test of alpha=0: 2.6e+04 Prob>=LRX2 = 0.000
----------------------------------------------------------------------
. nbreg dq_m qhw amtb_q,nolog
Negative binomial regression Number of obs = 36
LR chi2(2) = 22.93
Dispersion = mean Prob > chi2 = 0.0000
Log likelihood = -303.70531 Pseudo R2 = 0.0364
------------------------------------------------------------------------------
dq_m | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
qhw | -.2612912 .2143484 -1.22 0.223 -.6814064 .158824
amtb_q | .0915887 .0220894 4.15 0.000 .0482943 .1348831
_cons | 1.043923 1.69191 0.62 0.537 -2.272159 4.360005
-------------+----------------------------------------------------------------
/lnalpha | -1.537799 .2281601 -1.984985 -1.090614
-------------+----------------------------------------------------------------
alpha | .2148534 .049021 .1373827 .3360102
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 2.7e+04 Prob>=chibar2 = 0.000
. listcoef,percent
nbreg (N=36): Percentage Change in Expected Count
Observed SD: 2073.4199
----------------------------------------------------------------------
dq_m | b z P>|z| % %StdX SDofX
-------------+--------------------------------------------------------
qhw | -0.26129 -1.219 0.223 -23.0 -10.8 0.4392
amtb_q | 0.09159 4.146 0.000 9.6 51.0 4.5012
-------------+--------------------------------------------------------
ln alpha | -1.53780
alpha | 0.21485 SE(alpha) = 0.04902
----------------------------------------------------------------------
LR test of alpha=0: 2.7e+04 Prob>=LRX2 = 0.000
----------------------------------------------------------------------
. nbreg dq_m amtb_q,nolog
Negative binomial regression Number of obs = 36
LR chi2(1) = 21.51
Dispersion = mean Prob > chi2 = 0.0000
Log likelihood = -304.4172 Pseudo R2 = 0.0341
------------------------------------------------------------------------------
dq_m | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
amtb_q | .1060044 .0189324 5.60 0.000 .0688975 .1431113
_cons | -.1003189 1.42492 -0.07 0.944 -2.893111 2.692473
-------------+----------------------------------------------------------------
/lnalpha | -1.500788 .2278508 -1.947367 -1.054209
-------------+----------------------------------------------------------------
alpha | .2229544 .0508003 .1426491 .3484681
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 2.8e+04 Prob>=chibar2 = 0.000
. listcoef,percent
nbreg (N=36): Percentage Change in Expected Count
Observed SD: 2073.4199
----------------------------------------------------------------------
dq_m | b z P>|z| % %StdX SDofX
-------------+--------------------------------------------------------
amtb_q | 0.10600 5.599 0.000 11.2 61.1 4.5012
-------------+--------------------------------------------------------
ln alpha | -1.50079
alpha | 0.22295 SE(alpha) = 0.05080
----------------------------------------------------------------------
LR test of alpha=0: 2.8e+04 Prob>=LRX2 = 0.000
----------------------------------------------------------------------
. predict amtb
(option n assumed; predicted number of events)
. tw (scatter dq_m quarter)|| line amtb quarter, legend(order(1 "Observed" 2 "NBRM_amtb"))
Furthermore, the number of heart attack cases during heatwave periods is lower than during non-heatwave periods.

