Dear Statalisters, I would appreciate your opinion on the issue below.
I have around 1 million observations on a few thousand manufacturers. The outcome variable is production quantity, which has a large share of zeroes (about 20%; a quick tally is sketched after the summary below).
When I run ppmlhdfe or xtpoisson on this variable, I get AIC figures in the billions (!) and a high pseudo R2. I have never come across AIC/BIC figures this large. When I rescale the outcome by dividing it by 1,000 (sketched below), the AIC/BIC figures look more "usual". However, the outcome contains values smaller than 1,000, so I am concerned that this transformation distorts its count nature.
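For concreteness, the rescaling I mean is along these lines (a minimal sketch; same specification as the full run further below):
Code:
* Sketch of the rescaling: outcome in thousands of units
gen dv1_k = dv1/1000
ppmlhdfe dv1_k l1.dv1 l1.z_cv2 l1.z_cv3 l1.z_cv4 l1.z_cv5, a(eproducer year) cluster(eproducer) exp(lagcv1)
estat ic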
On the other hand, when I run xtnbreg, I get smaller (though still in the millions) AIC and BIC figures. However, xtnbreg is sensitive to the model specification and often fails to converge when I add or remove predictors.
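To coax convergence I have been experimenting with Stata's standard maximize options, roughly as below (a sketch; difficult and iterate() are documented maximize options, though I cannot say they help here):
Code:
* Sketch: standard maximize options for a stubborn xtnbreg
xtnbreg dv1 l1.dv1 l1.z_cv2 l1.z_cv3 l1.z_cv4 l1.z_cv5 i.year, fe exp(lagcv1) difficult iterate(100)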
Do you think the large AIC figures should put me off using fixed-effects Poisson for this variable? Does overdispersion make xtnbreg the better choice? In any case, ΔAIC across models is in the millions (always six zeros), so I am also not sure how to present the results (one tabulation idea is sketched after the outputs below).
Code:
. sum dv1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
         dv1 |  1,128,675    17211.52    223438.7          0   1.35e+07
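For reference, a quick tally along these lines reproduces the zero share I mentioned (a minimal sketch using dv1 from the summary above):
Code:
* Quick tally of the zero share in the outcome (sketch)
count if dv1 == 0
display "share of zeroes = " r(N)/_N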
Code:
. ppmlhdfe dv1 l1.dv1 l1.z_cv2 l1.z_cv3 l1.z_cv4 l1.z_cv5, a(eproducer year) cluster(eproducer) exp(lagcv1) d
(dropped 51273 observations that are either singletons or separated by a fixed effect)
Iteration 1:   deviance = 1.2802e+10  eps = .         iters = 6   tol = 1.0e-04  min(eta) =  -9.11  P
Iteration 2:   deviance = 5.7570e+09  eps = 1.22e+00  iters = 6   tol = 1.0e-04  min(eta) =  -9.70
Iteration 3:   deviance = 4.3877e+09  eps = 3.12e-01  iters = 5   tol = 1.0e-04  min(eta) = -10.07
Iteration 4:   deviance = 4.0823e+09  eps = 7.48e-02  iters = 5   tol = 1.0e-04  min(eta) = -10.59
Iteration 5:   deviance = 4.0171e+09  eps = 1.63e-02  iters = 4   tol = 1.0e-04  min(eta) = -11.49
Iteration 6:   deviance = 4.0040e+09  eps = 3.25e-03  iters = 3   tol = 1.0e-04  min(eta) = -12.30
Iteration 7:   deviance = 4.0015e+09  eps = 6.25e-04  iters = 2   tol = 1.0e-04  min(eta) = -12.96
Iteration 8:   deviance = 4.0011e+09  eps = 1.15e-04  iters = 2   tol = 1.0e-04  min(eta) = -13.96
Iteration 9:   deviance = 4.0010e+09  eps = 2.10e-05  iters = 2   tol = 1.0e-04  min(eta) = -14.96
Iteration 10:  deviance = 4.0010e+09  eps = 4.28e-06  iters = 2   tol = 1.0e-05  min(eta) = -15.94
Iteration 11:  deviance = 4.0010e+09  eps = 9.03e-07  iters = 2   tol = 1.0e-06  min(eta) = -16.91  S
Iteration 12:  deviance = 4.0010e+09  eps = 1.88e-07  iters = 2   tol = 1.0e-07  min(eta) = -17.82  S
Iteration 13:  deviance = 4.0010e+09  eps = 3.81e-08  iters = 2   tol = 1.0e-07  min(eta) = -18.61  S
Iteration 14:  deviance = 4.0010e+09  eps = 6.80e-09  iters = 2   tol = 1.0e-08  min(eta) = -19.14  S O
------------------------------------------------------------------------------------------------------------
(legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon below tolerance)
Converged in 14 iterations and 45 HDFE sub-iterations (tol = 1.0e-08)

HDFE PPML regression                              No. of obs      =    942,909
Absorbing 2 HDFE groups                           Residual df     =      4,129
Statistics robust to heteroskedasticity           Wald chi2(5)    =      84.28
Deviance             =  4000971161                Prob > chi2     =     0.0000
Log pseudolikelihood = -2003530151                Pseudo R2       =     0.9694

Number of clusters (eproducer) = 4,130
                              (Std. err. adjusted for 4,130 clusters in eproducer)
------------------------------------------------------------------------------
             |               Robust
         dv1 | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         dv1 |
         L1. |   1.70e-07   2.86e-08     5.95   0.000     1.14e-07    2.26e-07
             |
       z_cv2 |
         L1. |   .1590041   .2122519     0.75   0.454     -.257002    .5750101
             |
       z_cv3 |
         L1. |  -.0043667   .0082502    -0.53   0.597    -.0205368    .0118033
             |
       z_cv4 |
         L1. |  -.2516427   .1229716    -2.05   0.041    -.4926626   -.0106228
             |
       z_cv5 |
         L1. |   .0085431   .0072125     1.18   0.236    -.0055932    .0226794
             |
       _cons |   5.650281   .1897047    29.78   0.000     5.278467    6.022095
  ln(lagcv1) |          1  (exposure)
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
   eproducer |      4130        4130           0    *|
        year |        46           0          46     |
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

. estat ic

Akaike's information criterion and Bayesian information criterion

-----------------------------------------------------------------------------
       Model |          N   ll(null)  ll(model)      df        AIC        BIC
-------------+---------------------------------------------------------------
           . |    942,909  -6.55e+10  -2.00e+09       6   4.01e+09   4.01e+09
-----------------------------------------------------------------------------
Note: BIC uses N = number of observations. See [R] BIC note.
Code:
. xtnbreg dv1 l1.dv1 l1.z_cv2 l1.z_cv3 l1.z_cv4 l1.z_cv5 i.year, fe exp(lagcv1)
note: 135 groups (135 obs) dropped because of only one obs per group
note: 629 groups (49921 obs) dropped because of all zero outcomes

Iteration 0:   log likelihood = -1.225e+09  (not concave)
Iteration 1:   log likelihood = -4.990e+08
Iteration 2:   log likelihood = -8027179.1
Iteration 3:   log likelihood = -7995697.9  (not concave)
Iteration 4:   log likelihood = -7517186.9
Iteration 5:   log likelihood = -7057044.2  (backed up)
Iteration 6:   log likelihood = -6798324.5
Iteration 7:   log likelihood =   -6734128
Iteration 8:   log likelihood = -6504148.3
Iteration 9:   log likelihood = -6355948.2
Iteration 10:  log likelihood = -6330436.9
Iteration 11:  log likelihood =   -6328037
Iteration 12:  log likelihood = -6327554.1
Iteration 13:  log likelihood = -6327459.7
Iteration 14:  log likelihood = -6327439.4
Iteration 15:  log likelihood = -6327434.6
Iteration 16:  log likelihood = -6327433.5
Iteration 17:  log likelihood = -6327433.2
Iteration 18:  log likelihood = -6327433.2
Iteration 19:  log likelihood = -6327433.2
Iteration 20:  log likelihood = -6327433.2
Iteration 21:  log likelihood = -6327433.2
Iteration 22:  log likelihood = -6327433.2
Iteration 23:  log likelihood = -6327433.2

Conditional FE negative binomial regression      Number of obs    =   944,126
Group variable: eproducer                        Number of groups =     4,130

                                                 Obs per group:
                                                              min =         2
                                                              avg =     228.6
                                                              max =       553

                                                 Wald chi2(51)    = 1033911.20
Log likelihood = -6327433.2                      Prob > chi2      =    0.0000

------------------------------------------------------------------------------
         dv1 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
         dv1 |
         L1. |   1.11e-07   5.15e-10   214.94   0.000     1.10e-07    1.12e-07
             |
       z_cv2 |
         L1. |  -.6988149   .0015773  -443.05   0.000    -.7019063   -.6957235
             |
       z_cv3 |
         L1. |  -.0279359   .0004649   -60.09   0.000    -.0288471   -.0270246
             |
       z_cv4 |
         L1. |   .4380268   .0156258    28.03   0.000     .4074009    .4686528
             |
       z_cv5 |
         L1. |    .016793   .0038206     4.40   0.000     .0093048    .0242811
             |
        year |  (output omitted)
             |
       _cons |  -1.826474   .0291059   -62.75   0.000     -1.88352   -1.769427
  ln(lagcv1) |          1  (exposure)
------------------------------------------------------------------------------

. estat ic

Akaike's information criterion and Bayesian information criterion

-----------------------------------------------------------------------------
       Model |          N   ll(null)  ll(model)      df        AIC        BIC
-------------+---------------------------------------------------------------
           . |    944,126          .   -6327433      52   1.27e+07   1.27e+07
-----------------------------------------------------------------------------
Note: BIC uses N = number of observations. See [R] BIC note.
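As for presenting the comparison, one way I could line the models up is with estimates store / estimates stats, roughly as sketched below (the stored names ppml and nbreg are placeholders):
Code:
* Sketch: side-by-side AIC/BIC table from stored estimates
ppmlhdfe dv1 l1.dv1 l1.z_cv2 l1.z_cv3 l1.z_cv4 l1.z_cv5, a(eproducer year) cluster(eproducer) exp(lagcv1)
estimates store ppml
xtnbreg dv1 l1.dv1 l1.z_cv2 l1.z_cv3 l1.z_cv4 l1.z_cv5 i.year, fe exp(lagcv1)
estimates store nbreg
estimates stats ppml nbreg    // reports N, ll, df, AIC, BIC per model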
Thank you!