Dear all,
I have the dataset summarized below. I would like to fit a Poisson regression model with dq_m as the dependent variable and ttb_q, tmin_q, tmax_q, amtb_q, dofhw_q and qhw as independent variables.
After fitting the Poisson model, I found that a negative binomial regression model is more appropriate for this dataset because of overdispersion. However, the final model I arrived at gives an unexpected result. As you can see from the attached graph, the model does not fit the observed values well. What should I do to improve the current model and find the best one?
Furthermore, because of the limitations of secondary data, the results from this dataset are very different from those reported in the medical literature. I would like to know whether I used the wrong modelling process or the wrong commands, or whether the dataset is simply poor.
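A minimal sketch of how the overdispersion can be checked, assuming the dataset and variable names used in this post. The raw variance/mean ratio is only a crude indicator, because the Poisson assumption concerns the variance conditional on the covariates; `estat gof` after `poisson` reports the deviance and Pearson goodness-of-fit tests.

```stata
* Sketch only: assumes the dataset and variable names from this post.
use Quarterly_Dofhw_Env_Analysis, clear

* Crude check: under a Poisson model the variance should be close to the mean.
quietly summarize dq_m
display "variance/mean of dq_m = " r(Var)/r(mean)

* Goodness-of-fit tests after the Poisson fit;
* very small p-values indicate that the Poisson model does not fit.
poisson dq_m ttb_q tmin_q tmax_q amtb_q dofhw_q qhw, nolog
estat gof
```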
Thank you all in advance!
Another issue is that the number of heart attack cases I have is quarterly data. I did not find any articles that use quarterly data to analyse the impact of heatwave events on heart disease. Do you have any thoughts on that? I really appreciate all of your advice!
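One common way to handle quarterly counts is to add quarter-of-year indicators and a time trend, so that the heatwave terms are not left to absorb seasonality. A minimal sketch, assuming `quarter` is a Stata quarterly date (%tq); `qoy`, `m_season` and `yhat_season` are hypothetical names, and this is an illustration rather than a recommended final model:

```stata
* Sketch only: assumes quarter is a Stata quarterly date (%tq).
* qoy, m_season and yhat_season are hypothetical names.
use Quarterly_Dofhw_Env_Analysis, clear

* Quarter of the year (1-4) extracted from the quarterly date
generate qoy = quarter(dofq(quarter))

* Season indicators plus a linear time trend alongside the heatwave terms
nbreg dq_m dofhw_q qhw amtb_q i.qoy c.quarter, nolog
estimates store m_season

* AIC and BIC, to compare against the other fitted models
estat ic

* Predicted counts (option n is the default) plotted against observed counts
predict yhat_season
twoway (scatter dq_m quarter) (line yhat_season quarter), ///
    legend(order(1 "Observed" 2 "NB with season and trend"))
```

If qhw equals 1 in the same quarter of every year, it will be collinear with i.qoy and Stata will omit one of the terms. The `///` line continuation only works in a do-file, not at the interactive prompt.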
Code:
. clear
. use Quarterly_Dofhw_Env_Analysis
. sum l

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       ttb_q |        36    28.27778    .7403131       27.3       30.3
      tmin_q |        36    25.28056    .8454312       24.1       27.5
      tmax_q |        36    33.57778    .8414648       32.2       35.8
      amtb_q |        36    75.14444    4.501202       65.3       82.8
      rtsh_q |        36    9.438889    4.036496          2       19.1
-------------+--------------------------------------------------------
   hmax_pa_q |        36    113.2194    12.41303       92.6      137.7
        dq_m |        36    2847.472     2073.42        765      11156
        dq_c |        36    91.38889    61.60718          4        310
     quarter |        36       201.5    10.53565        184        219
          id |        36        18.5    10.53565          1         36
-------------+--------------------------------------------------------
     dofhw_q |        36    2.222222    5.037636          0         22
         qhw |        36         .25     .439155          0          1

. gr box dq_m, by(qhw)

. poisson dq_m ttb_q tmin_q tmax_q amtb_q dofhw_q qhw, nolog

Poisson regression                              Number of obs     =         36
                                                LR chi2(6)        =   18257.00
                                                Prob > chi2       =     0.0000
Log likelihood = -12276.162                     Pseudo R2         =     0.4265

------------------------------------------------------------------------------
        dq_m |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       ttb_q |   1.163322   .0260642    44.63   0.000     1.112237    1.214407
      tmin_q |   -.641617   .0199307   -32.19   0.000    -.6806803   -.6025536
      tmax_q |  -.3136857    .020455   -15.34   0.000    -.3537768   -.2735945
      amtb_q |   .1041063   .0017078    60.96   0.000     .1007591    .1074534
     dofhw_q |   .0151342   .0014126    10.71   0.000     .0123656    .0179027
         qhw |  -.5604745   .0162171   -34.56   0.000    -.5922594   -.5286896
       _cons |  -6.012431   .3063855   -19.62   0.000    -6.612936   -5.411927
------------------------------------------------------------------------------

. nbreg dq_m ttb_q tmin_q tmax_q amtb_q dofhw_q qhw, nolog

Negative binomial regression                    Number of obs     =         36
                                                LR chi2(6)        =      26.50
Dispersion     = mean                           Prob > chi2       =     0.0002
Log likelihood = -301.9212                      Pseudo R2         =     0.0420

------------------------------------------------------------------------------
        dq_m |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       ttb_q |   1.023928   .6093453     1.68   0.093    -.1703668    2.218223
      tmin_q |   -.410953   .5268968    -0.78   0.435    -1.443652    .6217458
      tmax_q |  -.4743927   .5131714    -0.92   0.355     -1.48019    .5314047
      amtb_q |   .0933756   .0412971     2.26   0.024     .0124348    .1743164
     dofhw_q |   .0136807   .0287798     0.48   0.635    -.0427266    .0700881
         qhw |  -.3632879   .3167615    -1.15   0.251    -.9841291    .2575533
       _cons |  -1.741708   7.948207    -0.22   0.827    -17.31991    13.83649
-------------+----------------------------------------------------------------
    /lnalpha |  -1.630928   .2288978                      -2.07956   -1.182297
-------------+----------------------------------------------------------------
       alpha |   .1957478   .0448062                      .1249852    .3065737
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) =  2.4e+04 Prob>=chibar2 = 0.000

. nbreg dq_m qhw dofhw_q amtb_q, nolog

Negative binomial regression                    Number of obs     =         36
                                                LR chi2(3)        =      23.61
Dispersion     = mean                           Prob > chi2       =     0.0000
Log likelihood = -303.36761                     Pseudo R2         =     0.0375

------------------------------------------------------------------------------
        dq_m |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         qhw |  -.4479989   .3051038    -1.47   0.142    -1.045991    .1499936
     dofhw_q |   .0200182   .0246346     0.81   0.416    -.0282647    .0683012
      amtb_q |   .0914506   .0217634     4.20   0.000     .0487951    .1341062
       _cons |   1.054483   1.667006     0.63   0.527    -2.212788    4.321755
-------------+----------------------------------------------------------------
    /lnalpha |  -1.555352   .2282957                     -2.002803   -1.107901
-------------+----------------------------------------------------------------
       alpha |    .211115   .0481967                      .1349564    .3302515
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) =  2.6e+04 Prob>=chibar2 = 0.000

. listcoef, percent

nbreg (N=36): Percentage Change in Expected Count

 Observed SD: 2073.4199

----------------------------------------------------------------------
        dq_m |      b         z     P>|z|      %      %StdX      SDofX
-------------+--------------------------------------------------------
         qhw |  -0.44800   -1.468   0.142    -36.1    -17.9     0.4392
     dofhw_q |   0.02002    0.813   0.416      2.0     10.6     5.0376
      amtb_q |   0.09145    4.202   0.000      9.6     50.9     4.5012
-------------+--------------------------------------------------------
    ln alpha |  -1.55535
       alpha |   0.21112   SE(alpha) = 0.04820
----------------------------------------------------------------------
 LR test of alpha=0: 2.6e+04   Prob>=LRX2 = 0.000
----------------------------------------------------------------------

. nbreg dq_m qhw amtb_q, nolog

Negative binomial regression                    Number of obs     =         36
                                                LR chi2(2)        =      22.93
Dispersion     = mean                           Prob > chi2       =     0.0000
Log likelihood = -303.70531                     Pseudo R2         =     0.0364

------------------------------------------------------------------------------
        dq_m |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         qhw |  -.2612912   .2143484    -1.22   0.223    -.6814064     .158824
      amtb_q |   .0915887   .0220894     4.15   0.000     .0482943    .1348831
       _cons |   1.043923    1.69191     0.62   0.537    -2.272159    4.360005
-------------+----------------------------------------------------------------
    /lnalpha |  -1.537799   .2281601                     -1.984985   -1.090614
-------------+----------------------------------------------------------------
       alpha |   .2148534    .049021                      .1373827    .3360102
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) =  2.7e+04 Prob>=chibar2 = 0.000

. listcoef, percent

nbreg (N=36): Percentage Change in Expected Count

 Observed SD: 2073.4199

----------------------------------------------------------------------
        dq_m |      b         z     P>|z|      %      %StdX      SDofX
-------------+--------------------------------------------------------
         qhw |  -0.26129   -1.219   0.223    -23.0    -10.8     0.4392
      amtb_q |   0.09159    4.146   0.000      9.6     51.0     4.5012
-------------+--------------------------------------------------------
    ln alpha |  -1.53780
       alpha |   0.21485   SE(alpha) = 0.04902
----------------------------------------------------------------------
 LR test of alpha=0: 2.7e+04   Prob>=LRX2 = 0.000
----------------------------------------------------------------------

. nbreg dq_m amtb_q, nolog

Negative binomial regression                    Number of obs     =         36
                                                LR chi2(1)        =      21.51
Dispersion     = mean                           Prob > chi2       =     0.0000
Log likelihood = -304.4172                      Pseudo R2         =     0.0341

------------------------------------------------------------------------------
        dq_m |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      amtb_q |   .1060044   .0189324     5.60   0.000     .0688975    .1431113
       _cons |  -.1003189    1.42492    -0.07   0.944    -2.893111    2.692473
-------------+----------------------------------------------------------------
    /lnalpha |  -1.500788   .2278508                     -1.947367   -1.054209
-------------+----------------------------------------------------------------
       alpha |   .2229544   .0508003                      .1426491    .3484681
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) =  2.8e+04 Prob>=chibar2 = 0.000

. listcoef, percent

nbreg (N=36): Percentage Change in Expected Count

 Observed SD: 2073.4199

----------------------------------------------------------------------
        dq_m |      b         z     P>|z|      %      %StdX      SDofX
-------------+--------------------------------------------------------
      amtb_q |   0.10600    5.599   0.000     11.2     61.1     4.5012
-------------+--------------------------------------------------------
    ln alpha |  -1.50079
       alpha |   0.22295   SE(alpha) = 0.05080
----------------------------------------------------------------------
 LR test of alpha=0: 2.8e+04   Prob>=LRX2 = 0.000
----------------------------------------------------------------------

. predict amtb
(option n assumed; predicted number of events)

. tw (scatter dq_m quarter) || line amtb quarter, legend(order(1 "Observed" 2 "NBRM_amtb"))

Furthermore, the number of heart attack cases during heatwaves is lower than during non-heatwave periods.