Linear individual fixed effect estimators - equivalence?

Hisab Shagird

18 Oct 2023, 18:24

I understand that "xtreg, fe", "areg, absorb(unit)", and "reg i.unit" are equivalent estimators and should produce the same estimates.

I was experimenting with these three estimators on a model whose only regressor is constant within each unit (time-invariant), expecting all three to drop it. To my surprise, reg with unit dummies actually estimates a coefficient for it.

To make it more concrete, here is the code:

Code:

use http://www.stata-press.com/data/r15/nlswork.dta, clear
generate white = race == 1
keep if idcode < 600 // reduce the number of units to facilitate reg estimation
xtset idcode
xtreg ln_wage white, fe
areg ln_wage white, absorb(idcode)
reg ln_wage white i.idcode

Why does the last regression estimate a coefficient and standard error for white when the other two estimators do not?

Last edited by Hisab Shagird; 18 Oct 2023, 18:26.
Tags: areg, dummy variables, fixed effects, linear models, xtreg

Hisab Shagird

18 Oct 2023, 18:29

Here are the results; the dummy-variable estimates from the last regression are omitted:

Code:

---------------------------------------------------
                      (1)          (2)          (3)
                  ln_wage      ln_wage      ln_wage
                   b/se/p       b/se/p       b/se/p
---------------------------------------------------
white                   0            0    -.4464315
                        .            .     .2112086
                        .            .      .034632
---------------------------------------------------
N_g                   541                          
N                    3253         3253         3253
---------------------------------------------------

Clyde Schechter

18 Oct 2023, 20:27

Look more carefully:

Code:

. reg ln_wage white i.idcode
note: 598.idcode omitted because of collinearity.

      Source |       SS           df       MS      Number of obs   =     3,253
-------------+----------------------------------   F(540, 2712)    =      6.68
       Model |  385.907807       540  .714644087   Prob > F        =    0.0000
    Residual |  290.351401     2,712  .107061726   R-squared       =    0.5707
-------------+----------------------------------   Adj R-squared   =    0.4852
       Total |  676.259208     3,252  .207951786   Root MSE        =     .3272

------------------------------------------------------------------------------
     ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       white |  -.4464315   .2112086    -2.11   0.035    -.8605775   -.0322856
             |

... [emphasis added]

The underlying phenomenon in all of these analyses is a collinearity among the i.idcode variables and the white variable. To identify the model, that collinearity must be broken, and it can be broken by imposing any one linear constraint among these variables. -areg- and -xtreg, fe- did it by removing the white variable. (Removing a variable is equivalent to imposing the constraint that its coefficient is zero.) As it happens, -regress- chose to do it instead by removing one of the i.idcode variables, namely 598.idcode. It makes no difference: the result is the same model with a different parameterization. If you run -predict- on all three models, you will find that the results are the same (except perhaps for trivial rounding errors, as the computations are performed differently).

This emphasizes the most important point: when you have such a collinearity, none of the coefficients of the involved variables can validly be understood to represent an effect of that variable. All of those coefficients are artifacts of the particular way the collinearity was broken. (In fact, you can prove mathematically that for any pre-selected value of a given coefficient, it is always possible to find a collinearity-breaking constraint that results in that coefficient having that value. All of these coefficients are entirely meaningless!)
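
The -predict- claim can be checked directly. The following is a minimal sketch, not part of the original post: the variable names yhat_reg, yhat_areg, yhat_xt, d1, and d2 are made up for illustration. Note that the fitted values have to include the unit effects, which means the xbd option after -areg- and the xbu option after -xtreg, fe-; the default xb after those two commands leaves the unit effects out.

```stata
use http://www.stata-press.com/data/r15/nlswork.dta, clear
generate white = race == 1
keep if idcode < 600
xtset idcode

* -regress- with unit dummies: white is kept, 598.idcode is omitted
quietly regress ln_wage white i.idcode
predict double yhat_reg, xb

* -areg-: xbd adds the absorbed unit effects to the linear prediction
quietly areg ln_wage white, absorb(idcode)
predict double yhat_areg, xbd

* -xtreg, fe-: xbu adds the estimated unit effect u_i
quietly xtreg ln_wage white, fe
predict double yhat_xt, xbu

* (hypothetical names) absolute differences between the fitted values
generate double d1 = abs(yhat_reg - yhat_areg)
generate double d2 = abs(yhat_reg - yhat_xt)
summarize d1 d2
```

If the three fits really are the same model, the maxima reported by -summarize- should be zero up to rounding error.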

Now, Stata does not directly give you control over which variables get omitted to break collinearity. But indirectly you do have control, in that in most, if not all, such situations Stata removes the variable(s) mentioned last in the command's varlist. Observe:

Code:

. reg ln_wage i.idcode white
note: white omitted because of collinearity.

      Source |       SS           df       MS      Number of obs   =     3,253
-------------+----------------------------------   F(540, 2712)    =      6.68
       Model |  385.907807       540  .714644087   Prob > F        =    0.0000
    Residual |  290.351401     2,712  .107061726   R-squared       =    0.5707
-------------+----------------------------------   Adj R-squared   =    0.4852
       Total |  676.259208     3,252  .207951786   Root MSE        =     .3272

------------------------------------------------------------------------------
     ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      idcode |
          2  |  -.3523788     .13358    -2.64   0.008    -.6143078   -.0904499
          3  |  -.4556787   .1267251    -3.60   0.000    -.7041663   -.2071911
          4  |   -.045801   .1365822    -0.34   0.737    -.3136167    .2220147
          5  |  -.1918034   .1365822    -1.40   0.160    -.4596191    .0760123
          6  |  -.2502497   .1267251    -1.97   0.048    -.4987373   -.0017621
          7  |  -.6753346    .149347    -4.52   0.000      -.96818   -.3824891
          9  |   .0231601    .130986     0.18   0.860    -.2336823    .2800025
         10  |  -.7139727   .1442829    -4.95   0.000    -.9968883   -.4310571
         12  |   .8825671   .2499053     3.53   0.000     .3925429    1.372591
         13  |   .2013828   .1365822     1.47   0.140    -.0664329    .4691985
         14  |   .0548292   .2499053     0.22   0.826    -.4351949    .5448534
         15  |   .4466563    .149347     2.99   0.003     .1538108    .7395017
         16  |   .1003248    .130986     0.77   0.444    -.1565176    .3571672
         17  |   .1422147   .1636014     0.87   0.385    -.1785814    .4630108
         18  |  -.5251419   .1889107    -2.78   0.005    -.8955654   -.1547185
         19  |  -.0573663     .13358    -0.43   0.668    -.3192953    .2045626
         20  |   .0461612    .130986     0.35   0.725    -.2106812    .3030036
...
... [redacted for brevity]
...
        580  |  -.3807083   .2499053    -1.52   0.128    -.8707325    .1093158
        582  |   .5842159   .2112086     2.77   0.006     .1700699    .9983619
        584  |  -.0020959   .1442829    -0.01   0.988    -.2850115    .2808197
        585  |  -.0182903     .13358    -0.14   0.891    -.2802192    .2436386
        586  |  -.1395406   .1442829    -0.97   0.334    -.4224562     .143375
        587  |   .9384173   .1636014     5.74   0.000     .6176212    1.259213
        588  |   .1404795   .2499053     0.56   0.574    -.3495447    .6305036
        589  |  -.2656911   .1442829    -1.84   0.066    -.5486067    .0172245
        590  |  -.5892505   .1889107    -3.12   0.002    -.9596739    -.218827
        591  |  -.0619256   .1442829    -0.43   0.668    -.3448412    .2209899
        593  |  -.6407163   .3405636    -1.88   0.060    -1.308507     .027074
        594  |  -.7191776   .3405636    -2.11   0.035    -1.386968   -.0513873
        595  |  -.3248108     .13358    -2.43   0.015    -.5867398   -.0628819
        596  |  -.1260314   .1365822    -0.92   0.356    -.3938471    .1417843
        597  |  -.0203266    .174167    -0.12   0.907    -.3618401     .321187
        598  |  -.4464315   .2112086    -2.11   0.035    -.8605775   -.0322856
        599  |  -.2415512   .1442829    -1.67   0.094    -.5244667    .0413644
             |
       white |          0  (omitted)
       _cons |   2.040434   .0944553    21.60   0.000     1.855222    2.225645
------------------------------------------------------------------------------
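
The two -regress- outputs can be matched coefficient by coefficient. The sketch below uses notation that is not in the original post: $D_{j}$ is the indicator for unit $j$ and $W$ is the set of white units. It also assumes that the base unit (idcode 1) is non-white and that unit 598 is white; the thread does not state this, but the matching numbers suggest it.

```latex
% Source of the collinearity: white is constant within unit
\text{white}_{it} = \sum_{j \in W} D_{j,it}

% Parameterization A: reg ln_wage white i.idcode   (598.idcode omitted)
E[y_{it}] = \alpha + \beta\,\text{white}_{it} + \sum_{j \neq 1,\,598} \gamma_j D_{j,it}

% Parameterization B: reg ln_wage i.idcode white   (white omitted)
E[y_{it}] = \alpha' + \sum_{j \neq 1} \gamma'_j D_{j,it}

% Equating the fitted unit means:
\alpha = \alpha', \qquad
\gamma_j = \gamma'_j \ (j \notin W), \qquad
\beta + \gamma_j = \gamma'_j \ (j \in W,\ j \neq 598), \qquad
\beta = \gamma'_{598}
```

Under those assumptions, the "effect of white" in the first output is simply the dummy coefficient of unit 598 in the second. That agrees with the numbers shown: both are -.4464315 with standard error .2112086.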

Hisab Shagird

19 Oct 2023, 13:19

Clyde Schechter, that makes sense; many thanks.

I then experimented with nonlinear count models, like so:

Code:

use http://www.stata-press.com/data/r15/nlswork.dta, clear
generate white = race == 1
keep if idcode < 600
xtset idcode


bysort idcode: generate numObs = _N
drop if numObs == 1

// eststo nl1: xtpoisson wks_work white, fe   // disabled: exits with r(102), see the output below
eststo nl2: poisson wks_work i.idcode white 
eststo nl3: xtnbreg wks_work white, fe
eststo nl4: nbreg wks_work i.idcode white 


esttab nl*, keep(white) cells(b se p) stats(N_g N)

As expected, xtpoisson, poisson, and nbreg cannot estimate a coefficient for white. However, xtnbreg does. I am not sure what is happening here.

The outputs are:

Code:

. eststo nl1: xtpoisson wks_work white, fe
note: white dropped because it is constant within group
independent variables required with fixed-effects model
r(102);

. esttab nl*, keep(white) cells(b se p) stats(N_g N)

---------------------------------------------------
                      (1)          (2)          (3)
                 wks_work     wks_work     wks_work
                   b/se/p       b/se/p       b/se/p
---------------------------------------------------
wks_work                                           
white                   0     .1037082            0
                        .     .0840918            .
                        .     .2174734            .
---------------------------------------------------
N_g                                473             
N                    3121         3121         3121
---------------------------------------------------

Clyde Schechter

19 Oct 2023, 13:35

-xtnbreg, fe- is not a fixed-effects model in the usual sense. The fixed effects apply only to the dispersion parameter, not to the full model. That is why -xtnbreg, fe- provides an estimate for white: there are no "fixed effects" in that part of the model, and the part of the model that does involve the fixed effects, the dispersion parameter, has nothing to do with the white variable.
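
A sketch of why the two conditional estimators treat a time-invariant regressor differently. It is written in notation that is not in the original post, with $\lambda_{it} = \exp(x_{it}\beta)$ and $\delta_i$ the unit-specific dispersion parameter. The formulas are quoted from memory of the Hausman, Hall and Griliches conditional model that -xtnbreg, fe- is based on, so treat them as an assumption to verify against the Methods and formulas section of [XT] xtnbreg.

```latex
% Conditional fixed-effects Poisson: covariates enter only through within-unit shares
\Pr\Big(y_{i1},\dots,y_{iT} \,\Big|\, \textstyle\sum_t y_{it}\Big)
  = \frac{\big(\sum_t y_{it}\big)!}{\prod_t y_{it}!}
    \prod_t \left( \frac{\lambda_{it}}{\sum_s \lambda_{is}} \right)^{y_{it}}

% Conditional "fixed-effects" negative binomial: delta_i drops out,
% but lambda_it enters in levels through the gamma functions
\Pr\Big(y_{i1},\dots,y_{iT} \,\Big|\, \textstyle\sum_t y_{it}\Big)
  = \frac{\Gamma\big(\sum_t \lambda_{it}\big)\,\Gamma\big(\sum_t y_{it} + 1\big)}
         {\Gamma\big(\sum_t \lambda_{it} + \sum_t y_{it}\big)}
    \prod_t \frac{\Gamma(\lambda_{it} + y_{it})}{\Gamma(\lambda_{it})\,\Gamma(y_{it} + 1)}
```

In the Poisson case, any factor of $\lambda_{it}$ that is constant within a unit, such as $\exp(\beta_w \,\text{white}_i)$, cancels from the ratio $\lambda_{it} / \sum_s \lambda_{is}$, so its coefficient is not identified. In the negative binomial case nothing cancels, so a constant and time-invariant regressors such as white remain estimable.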
Hisab Shagird

Join Date: Oct 2023

Posts: 11
#6

19 Oct 2023, 13:47

Clyde Schechter, many thanks. I assume the Stata manual would be the best place to get more details on what you have explained here. I will look there.