Singletons dropped for ppmlhdfe (Poisson regression) but not reghdfe (linear regression)

Jimmy Graham

Join Date: Feb 2020
Posts: 3


08 Aug 2023, 14:05

I am running a fixed effects regression with two time periods and 3,492 units. When I run a linear regression using reghdfe, I have no issues and no observations are dropped. When I run a Poisson regression with ppmlhdfe, Stata drops "6632 observations that are either singletons or separated by a fixed effect." My question is: why would Stata drop singletons for a Poisson fixed effects regression but not a linear fixed effects regression?

Also, it seems that most of the dropped observations in the Poisson regression are the units that experience no change in the dependent variable from period 1 to period 2. Why would Stata drop these observations? It seems to me that they still provide useful information.

Here is the code and output for reghdfe, which does not drop observations:

Code:

reghdfe event treat_any, absorb(cell_id) cluster(cell_id)

(MWFE estimator converged in 1 iterations)

HDFE Linear regression                            Number of obs   =      6,984
Absorbing 1 HDFE group                            F(   1,   3491) =      12.42
Statistics robust to heteroskedasticity           Prob > F        =     0.0004
                                                  R-squared       =     0.6370
                                                  Adj R-squared   =     0.2738
                                                  Within R-sq.    =     0.0070
Number of clusters (cell_id) =      3,492         Root MSE        =     0.3213

                            (Std. Err. adjusted for 3,492 clusters in cell_id)
------------------------------------------------------------------------------
             |               Robust
       event |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   treat_any |  -.0765679   .0217247    -3.52   0.000    -.1191623   -.0339735
       _cons |   .0614419   .0033764    18.20   0.000      .054822    .0680619
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
     cell_id |      3492        3492           0    *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

Here is the code and output for ppmlhdfe, which does drop observations:

Code:

ppmlhdfe event treat_any, absorb(cell_id) cluster(cell_id)

(dropped 6632 observations that are either singletons or separated by a fixed effect)
Iteration 1:   deviance = 3.3482e+02  eps = .         iters = 1    tol = 1.0e-04  min(eta) =  -1.40  P   
Iteration 2:   deviance = 3.1968e+02  eps = 4.74e-02  iters = 1    tol = 1.0e-04  min(eta) =  -1.54      
Iteration 3:   deviance = 3.1938e+02  eps = 9.43e-04  iters = 1    tol = 1.0e-04  min(eta) =  -1.56      
Iteration 4:   deviance = 3.1938e+02  eps = 9.97e-07  iters = 1    tol = 1.0e-04  min(eta) =  -1.56      
Iteration 5:   deviance = 3.1938e+02  eps = 3.57e-12  iters = 1    tol = 1.0e-05  min(eta) =  -1.56   S O
------------------------------------------------------------------------------------------------------------
(legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon below tolerance)
Converged in 5 iterations and 5 HDFE sub-iterations (tol = 1.0e-08)

HDFE PPML regression                              No. of obs      =        352
Absorbing 1 HDFE group                            Residual df     =        175
Statistics robust to heteroskedasticity           Wald chi2(1)    =      15.97
Deviance             =  319.3822431               Prob > chi2     =     0.0001
Log pseudolikelihood = -392.1367457               Pseudo R2       =     0.2276

Number of clusters (cell_id)=        176
                              (Std. Err. adjusted for 176 clusters in cell_id)
------------------------------------------------------------------------------
             |               Robust
       event |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   treat_any |   -.894175   .2237562    -4.00   0.000    -1.332729    -.455621
       _cons |   .4578038   .0352768    12.98   0.000     .3886626    .5269451
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
     cell_id |       176         176           0    *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation


Joao Santos Silva

Join Date: Apr 2014

Posts: 3028
#2

08 Aug 2023, 15:32

Dear Jimmy Graham,

My guess is that the observations being dropped are mostly zeros that are perfectly predicted by fixed effects. The authors of the command have written about this issue and I suggest you check the relevant documents.

Best wishes,

Joao
Jimmy Graham

Join Date: Feb 2020

Posts: 3
#3

10 Aug 2023, 10:07

Thanks for your reply. I did some more digging, and the issue is related to separation; it was not actually singletons that were dropped. As with logit and probit models, Poisson fixed-effects models can't handle groups with no variation in the dependent variable. This may be the case with maximum likelihood models in general. See here for reference: https://github.com/sergiocorreia/ppm...tion_primer.md.
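A quick way to see which cells are involved is to count the observations in cells where event is zero in both periods and compare that number with the 6,632 dropped observations (3,316 cells, which leaves the 176 clusters shown in the ppmlhdfe output). This is only a minimal sketch using built-in Stata commands; the variable names cell_total and const_pos are hypothetical, and it assumes event is non-negative with no missing values.

```stata
* Sum of the outcome within each cell (cell_total is a hypothetical name)
bysort cell_id: egen double cell_total = total(event)

* Observations in cells where the outcome is zero in every period
count if cell_total == 0

* For comparison: cells whose outcome is constant but positive
* (sorted by event within cell, so event[1] is the min and event[_N] the max)
bysort cell_id (event): gen byte const_pos = event[1] == event[_N] & event[1] > 0
count if const_pos
```

If the first count matches the number of dropped observations, the dropped cells are the all-zero ones rather than every cell with an unchanged outcome.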
Joao Santos Silva

Join Date: Apr 2014

Posts: 3028
#4

10 Aug 2023, 11:08

Dear Jimmy Graham,

I am well aware of the issue; see here and here.

The problem is not exactly as you describe it. In short, some observations contain no information on the parameters of interest and can, and should, be dropped.
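To sketch why, consider the first-order conditions of the Poisson pseudo-likelihood with unit fixed effects. The notation here is assumed for illustration and does not come from the posts above: $y_{it} \ge 0$ is the outcome for unit $i$ in period $t$, $x_{it}$ the regressors, $\alpha_i$ the fixed effect, and $\mu_{it} = \exp(\alpha_i + x_{it}'\beta)$ the conditional mean.

```latex
\begin{align*}
\ell(\beta, \alpha) &= \sum_i \sum_t \bigl[ y_{it}(\alpha_i + x_{it}'\beta) - \exp(\alpha_i + x_{it}'\beta) \bigr] + \text{const}, \\
\frac{\partial \ell}{\partial \alpha_i} = 0
  &\;\Longrightarrow\; \sum_t \bigl( y_{it} - \mu_{it} \bigr) = 0
  \;\Longrightarrow\; \exp(\hat\alpha_i) = \frac{\sum_t y_{it}}{\sum_t \exp(x_{it}'\beta)}, \\
\sum_t y_{it} = 0
  &\;\Longrightarrow\; \hat\alpha_i \to -\infty, \quad \hat\mu_{it} \to 0 = y_{it}, \quad
  \sum_t \bigl( y_{it} - \hat\mu_{it} \bigr) x_{it} = 0 .
\end{align*}
```

Under these assumptions, a unit whose outcome is zero in every period adds nothing to the score for $\beta$, whereas a unit with a constant but positive outcome still has a finite $\hat\alpha_i$ and generally does contribute when its regressors vary over time.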

Best wishes,

Joao