Dear all,
I am using the ppmlhdfe command by Sergio Correia, Paulo Guimarães, and Thomas Zylkin.
The goal is to estimate a Poisson model with many levels of fixed effects (i.e., four categorical variables, some of which are also interacted). This model fails to converge using the conventional poisson command, or even glm .. family(poisson).
The ppmlhdfe command works well in the sense that i) it converges and ii) it is very fast. It does so by dropping singletons/separated observations.
In the specifications where the Poisson command also converged, the point estimates are identical.
Next, I am trying to do some out-of-sample prediction. The command does not allow for this, so it must be done manually by adding the estimated fixed effects.
Here, however, I am having some trouble understanding the output.
Example with only one binary FE and no other covariate:
Code:
sysuse auto.dta, clear
ppmlhdfe price, absorb(foreign, savefe) d(sumFE)
- absorb(..., savefe) save all fixed effect estimates with __hdfe as prefix
- d(newvar) save sum of fixed effects as newvar; mandatory if running predict afterwards (except for predict,xb)
Code:
. ppmlhdfe price, absorb(foreign, savefe) d(sumFE)
Iteration 1:   deviance = 8.7262e+04  eps = .         iters = 1    tol = 1.0e-04  min(eta) =   0.75  PS
Iteration 2:   deviance = 8.6958e+04  eps = 3.50e-03  iters = 1    tol = 1.0e-04  min(eta) =   0.72   S
Iteration 3:   deviance = 8.6958e+04  eps = 6.06e-07  iters = 1    tol = 1.0e-04  min(eta) =   0.72   S
Iteration 4:   deviance = 8.6958e+04  eps = 2.13e-14  iters = 1    tol = 1.0e-05  min(eta) =   0.72   S O
------------------------------------------------------------------------------------------------------------
(legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon below tolerance)
Converged in 4 iterations and 4 HDFE sub-iterations (tol = 1.0e-08)

HDFE PPML regression                              No. of obs      =         74
Absorbing 1 HDFE group                            Residual df     =         72
                                                  Wald chi2(0)    =          .
Deviance             =  86958.07836               Prob > chi2     =          .
Log pseudolikelihood =  -43866.7452               Pseudo R2       =     0.0028
------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   8.726951   .0555475   157.11   0.000      8.61808    8.835822
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
     foreign |         2           0           2     |
-----------------------------------------------------+

. tab __hdfe1__

       [FE] |
  1.foreign |      Freq.     Percent        Cum.
------------+-----------------------------------
  -.0154382 |         52       70.27       70.27
   .0347057 |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

. tab sumFE

     Sum of |
      fixed |
    effects |      Freq.     Percent        Cum.
------------+-----------------------------------
  -.0160663 |          1        1.35        1.35
  -.0159765 |          1        1.35        2.70
  -.0159186 |          1        1.35        4.05
  -.0159104 |          1        1.35        5.41
  -.0157847 |          1        1.35        6.76
  -.0157775 |          1        1.35        8.11
  -.0157128 |          1        1.35        9.46
  -.0157128 |          1        1.35       10.81
  -.0156133 |          1        1.35       12.16
  -.0155503 |          1        1.35       13.51
  -.0154646 |          1        1.35       14.86
  -.0154554 |          1        1.35       16.22
  -.0154529 |          1        1.35       17.57
  -.0154441 |          1        1.35       18.92
  -.0154263 |          1        1.35       20.27
  -.0154207 |          1        1.35       21.62
    -.01542 |          1        1.35       22.97
  -.0154147 |          1        1.35       24.32
  -.0153939 |          1        1.35       25.68
  -.0153839 |          1        1.35       27.03
  -.0153818 |          1        1.35       28.38
  -.0153807 |          1        1.35       29.73
  -.0153764 |          1        1.35       31.08
  -.0153655 |          1        1.35       32.43
  -.0153627 |          1        1.35       33.78
   -.015358 |          1        1.35       35.14
  -.0153537 |          1        1.35       36.49
  -.0153527 |          1        1.35       37.84
   -.015352 |          1        1.35       39.19
  -.0153472 |          1        1.35       40.54
  -.0153388 |          1        1.35       41.89
   -.015338 |          1        1.35       43.24
  -.0153366 |          1        1.35       44.59
  -.0153348 |          1        1.35       45.95
   -.015333 |          1        1.35       47.30
  -.0153329 |          1        1.35       48.65
  -.0153307 |          1        1.35       50.00
  -.0153183 |          1        1.35       51.35
  -.0153178 |          1        1.35       52.70
  -.0153174 |          1        1.35       54.05
  -.0153168 |          1        1.35       55.41
  -.0153122 |          1        1.35       56.76
  -.0153111 |          1        1.35       58.11
  -.0153097 |          1        1.35       59.46
  -.0153065 |          1        1.35       60.81
  -.0153048 |          1        1.35       62.16
   -.015303 |          1        1.35       63.51
  -.0152949 |          1        1.35       64.86
   -.015293 |          1        1.35       66.22
  -.0152846 |          1        1.35       67.57
  -.0152611 |          1        1.35       68.92
  -.0152606 |          1        1.35       70.27
   .0345068 |          1        1.35       71.62
   .0345368 |          1        1.35       72.97
   .0346048 |          1        1.35       74.32
   .0346062 |          1        1.35       75.68
   .0346532 |          1        1.35       77.03
   .0346829 |          1        1.35       78.38
   .0346917 |          1        1.35       79.73
   .0347084 |          1        1.35       81.08
   .0347104 |          1        1.35       82.43
   .0347203 |          1        1.35       83.78
   .0347233 |          1        1.35       85.14
   .0347257 |          1        1.35       86.49
   .0347354 |          1        1.35       87.84
    .034745 |          1        1.35       89.19
   .0347565 |          1        1.35       90.54
   .0347597 |          1        1.35       91.89
   .0347624 |          1        1.35       93.24
   .0347686 |          1        1.35       94.59
   .0347776 |          1        1.35       95.95
   .0347806 |          1        1.35       97.30
   .0347836 |          1        1.35       98.65
   .0347851 |          1        1.35      100.00
------------+-----------------------------------
      Total |         74      100.00
- Despite foreign being a binary variable, there is an estimate for each of its two values as well as an estimate for the constant. How are the values determined?
- Despite only one FE being used, __hdfe1__ and sumFE are not the same.
- sumFE seems to be different for every observation.
- When comparing against the standard poisson command, the predicted values are (slightly) different, using either estimate of the FE. Note that in this simple example ppmlhdfe does not drop any observations.
Code:
predict yhat_ppmlhdfe
gen yhat_ppmlhdfe_manual = exp(_b[_cons] + __hdfe1__)
poisson price i.foreign
predict yhat_poisson

. su yhat*

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
yhat_ppmlh~e |         74    6165.257    143.7014    6068.61   6385.188
yhat_ppmlh~l |         74    6165.257    143.6977   6072.423   6384.682
yhat_poisson |         74    6165.257    143.6979   6072.423   6384.682

.
. compare yhat_ppmlhdfe yhat_poisson

                                        ---------- difference ----------
                            count       minimum      average     maximum
------------------------------------------------------------------------
yhat_pp~e<yhat_po~n            21     -3.812361    -1.270896   -.0355738
yhat_pp~e>yhat_po~n            53      .0171664     .5039798    1.079185
                       ----------
jointly defined                74     -3.812361     .0002987    1.079185
                       ----------
total                          74

. compare yhat_ppmlhdfe_manual yhat_poisson

                                        ---------- difference ----------
                            count       minimum      average     maximum
------------------------------------------------------------------------
yhat_pp~l=yhat_po~n            22
yhat_pp~l>yhat_po~n            52      .0004883     .0004883    .0004883
                       ----------
jointly defined                74             0     .0003431    .0004883
                       ----------
total                          74
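One further sanity check that may help narrow this down: in a Poisson model whose only regressors are a constant and a group indicator, the first-order conditions force the fitted values to equal the within-group sample means of the outcome. The sketch below compares each set of predictions against those means. It assumes the variables created above are still in memory; price_mean is a helper variable introduced here purely for illustration.

```stata
* Assumes yhat_ppmlhdfe, yhat_ppmlhdfe_manual and yhat_poisson exist (see code above).
* price_mean is a hypothetical helper variable, not part of the original session.
egen double price_mean = mean(price), by(foreign)

* Fitted values from a Poisson model with only a group dummy should match
* the group means of price, up to storage precision.
compare yhat_poisson price_mean
compare yhat_ppmlhdfe_manual price_mean

* If predict after ppmlhdfe relies on sumFE, this comparison would be expected
* to show the same discrepancy as the one against yhat_poisson.
compare yhat_ppmlhdfe price_mean
```

If the first two comparisons agree with the group means and the third does not, that would point to the variable saved by d(sumFE), rather than to the coefficient or __hdfe1__ estimates, as the source of the difference.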
I might contact the authors directly but thought it would be best to share this first.