  • Linear individual fixed effect estimators - equivalence?

    As I understand it, "xtreg, fe", "areg, absorb(unit)", and "reg i.unit" are equivalent estimators and should produce the same estimates.

    I was experimenting with these three estimators on a model whose only regressor is constant within unit, and I expected none of them to be able to estimate its coefficient. To my surprise, reg with unit dummies does report one.

    To make it more concrete, here is the code:

    Code:
    use http://www.stata-press.com/data/r15/nlswork.dta, clear
    generate white = race == 1
    keep if idcode < 600    // reduce the number of units to facilitate reg estimation
    xtset idcode
    xtreg ln_wage white, fe
    areg ln_wage white , absorb(idcode )
    reg ln_wage white i.idcode

    Why does the last regression report a coefficient and standard error for white when the other two estimators do not?
    Last edited by Hisab Shagird; 18 Oct 2023, 18:26.

  • #2
    Here are the results, in the order xtreg, fe (1), areg (2), and reg (3), with the unit-dummy estimates omitted from the last regression:

    Code:
    ---------------------------------------------------
                          (1)          (2)          (3)
                      ln_wage      ln_wage      ln_wage
                       b/se/p       b/se/p       b/se/p
    ---------------------------------------------------
    white                   0            0    -.4464315
                            .            .     .2112086
                            .            .      .034632
    ---------------------------------------------------
    N_g                   541                          
    N                    3253         3253         3253
    ---------------------------------------------------



    • #3
      Look more carefully:
      Code:
      . reg ln_wage white i.idcode
      note: 598.idcode omitted because of collinearity.
      
            Source |       SS           df       MS      Number of obs   =     3,253
      -------------+----------------------------------   F(540, 2712)    =      6.68
             Model |  385.907807       540  .714644087   Prob > F        =    0.0000
          Residual |  290.351401     2,712  .107061726   R-squared       =    0.5707
      -------------+----------------------------------   Adj R-squared   =    0.4852
             Total |  676.259208     3,252  .207951786   Root MSE        =     .3272
      
      ------------------------------------------------------------------------------
           ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
             white |  -.4464315   .2112086    -2.11   0.035    -.8605775   -.0322856
                   |
      
      ... [emphasis added]
      The underlying phenomenon in all of these analyses is that there is a collinearity among the i.idcode variables and the white variable. To identify the model, that collinearity must be broken, and it can be broken by imposing any one linear constraint among these variables. -areg- and -xtreg, fe- did it by removing the white variable. (Removing a variable is equivalent to imposing the constraint that its coefficient is zero.) As it happens, -regress- chose instead to remove one of the i.idcode variables, namely 598.idcode. It makes no difference: the result is the same model with a different parameterization. If you run -predict- on all three models, making sure the unit effects are included in the prediction (the -xbu- option after -xtreg, fe- and the -xbd- option after -areg-), you will find that the results are the same, except perhaps for trivial rounding errors because the computations are performed differently.

      This underscores the most important point: when you have such a collinearity, none of the coefficients of the involved variables can be validly understood to represent an effect of that variable. All of those coefficients are artifacts of the particular way that the collinearity was broken. (In fact, you can prove mathematically that for any pre-selected value of a given coefficient, it is always possible to find a collinearity-breaking constraint that will result in that coefficient having that value. All of these coefficients are entirely meaningless!)
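
      The point about identical fitted values can be checked with a short sketch like the one below. It assumes the same data set and sample restriction as in #1; the variable names yhat_xt, yhat_areg, yhat_reg, and maxdiff are made up for illustration. The unit effects have to be part of the prediction, hence -xbu- after -xtreg, fe- and -xbd- after -areg-.

      ```stata
      use http://www.stata-press.com/data/r15/nlswork.dta, clear
      generate white = race == 1
      keep if idcode < 600
      xtset idcode

      // within estimator: prediction = xb + estimated unit effect u_i
      quietly xtreg ln_wage white, fe
      predict double yhat_xt, xbu

      // absorbed dummies: prediction = xb + absorbed unit effect
      quietly areg ln_wage white, absorb(idcode)
      predict double yhat_areg, xbd

      // explicit dummies: the linear prediction already contains them
      quietly regress ln_wage white i.idcode
      predict double yhat_reg, xb

      // largest absolute disagreement between the three sets of fitted values
      generate double maxdiff = max(abs(yhat_xt - yhat_reg), abs(yhat_areg - yhat_reg))
      summarize maxdiff
      ```

      If the three parameterizations really describe the same model, maxdiff should be zero up to rounding, even though the reported coefficient on white differs across the commands.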

      Now, Stata does not directly give you control over which variables get omitted to break collinearity. But indirectly you do have control, in that in most, if not all, such situations Stata removes the variable(s) mentioned last in the command's varlist. Observe:
      Code:
      . reg ln_wage i.idcode white
      note: white omitted because of collinearity.
      
            Source |       SS           df       MS      Number of obs   =     3,253
      -------------+----------------------------------   F(540, 2712)    =      6.68
             Model |  385.907807       540  .714644087   Prob > F        =    0.0000
          Residual |  290.351401     2,712  .107061726   R-squared       =    0.5707
      -------------+----------------------------------   Adj R-squared   =    0.4852
             Total |  676.259208     3,252  .207951786   Root MSE        =     .3272
      
      ------------------------------------------------------------------------------
           ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
            idcode |
                2  |  -.3523788     .13358    -2.64   0.008    -.6143078   -.0904499
                3  |  -.4556787   .1267251    -3.60   0.000    -.7041663   -.2071911
                4  |   -.045801   .1365822    -0.34   0.737    -.3136167    .2220147
                5  |  -.1918034   .1365822    -1.40   0.160    -.4596191    .0760123
                6  |  -.2502497   .1267251    -1.97   0.048    -.4987373   -.0017621
                7  |  -.6753346    .149347    -4.52   0.000      -.96818   -.3824891
                9  |   .0231601    .130986     0.18   0.860    -.2336823    .2800025
               10  |  -.7139727   .1442829    -4.95   0.000    -.9968883   -.4310571
               12  |   .8825671   .2499053     3.53   0.000     .3925429    1.372591
               13  |   .2013828   .1365822     1.47   0.140    -.0664329    .4691985
               14  |   .0548292   .2499053     0.22   0.826    -.4351949    .5448534
               15  |   .4466563    .149347     2.99   0.003     .1538108    .7395017
               16  |   .1003248    .130986     0.77   0.444    -.1565176    .3571672
               17  |   .1422147   .1636014     0.87   0.385    -.1785814    .4630108
               18  |  -.5251419   .1889107    -2.78   0.005    -.8955654   -.1547185
               19  |  -.0573663     .13358    -0.43   0.668    -.3192953    .2045626
               20  |   .0461612    .130986     0.35   0.725    -.2106812    .3030036
      ...
      ... [redacted for brevity]
      ...
              580  |  -.3807083   .2499053    -1.52   0.128    -.8707325    .1093158
              582  |   .5842159   .2112086     2.77   0.006     .1700699    .9983619
              584  |  -.0020959   .1442829    -0.01   0.988    -.2850115    .2808197
              585  |  -.0182903     .13358    -0.14   0.891    -.2802192    .2436386
              586  |  -.1395406   .1442829    -0.97   0.334    -.4224562     .143375
              587  |   .9384173   .1636014     5.74   0.000     .6176212    1.259213
              588  |   .1404795   .2499053     0.56   0.574    -.3495447    .6305036
              589  |  -.2656911   .1442829    -1.84   0.066    -.5486067    .0172245
              590  |  -.5892505   .1889107    -3.12   0.002    -.9596739    -.218827
              591  |  -.0619256   .1442829    -0.43   0.668    -.3448412    .2209899
              593  |  -.6407163   .3405636    -1.88   0.060    -1.308507     .027074
              594  |  -.7191776   .3405636    -2.11   0.035    -1.386968   -.0513873
              595  |  -.3248108     .13358    -2.43   0.015    -.5867398   -.0628819
              596  |  -.1260314   .1365822    -0.92   0.356    -.3938471    .1417843
              597  |  -.0203266    .174167    -0.12   0.907    -.3618401     .321187
              598  |  -.4464315   .2112086    -2.11   0.035    -.8605775   -.0322856
              599  |  -.2415512   .1442829    -1.67   0.094    -.5244667    .0413644
                   |
             white |          0  (omitted)
             _cons |   2.040434   .0944553    21.60   0.000     1.855222    2.225645
      ------------------------------------------------------------------------------
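
      The collinearity itself can be verified directly. The following is a minimal sketch, again assuming the data set and sample restriction from #1: it first asserts that white never changes within idcode, and then regresses white on the unit dummies, which should fit it perfectly.

      ```stata
      use http://www.stata-press.com/data/r15/nlswork.dta, clear
      generate white = race == 1
      keep if idcode < 600

      // white is constant within each person: smallest and largest values coincide
      bysort idcode (white): assert white[1] == white[_N]

      // so white is an exact linear combination of the unit dummies
      quietly regress white i.idcode
      display "R-squared = " e(r2)
      ```

      An R-squared of 1 here says that white adds no information beyond the i.idcode indicators, which is why one variable from the set has to be dropped.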



      • #4
        Clyde Schechter, that makes sense; many thanks.

        I am now experimenting with nonlinear count models, like so:

        Code:
        use http://www.stata-press.com/data/r15/nlswork.dta, clear
        generate white = race == 1
        keep if idcode < 600
        xtset idcode
        
        
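        // keep only units observed more than once
        bysort idcode: generate numObs = _N
        drop if numObs == 1
        
        // xtpoisson, fe is commented out because it exits with r(102): white is constant within group
        //eststo nl1: xtpoisson wks_work white, fe
        eststo nl2: poisson wks_work i.idcode white
        eststo nl3: xtnbreg wks_work white, fe
        eststo nl4: nbreg wks_work i.idcode white
        
        // report only the coefficient on white from each stored model
        esttab nl*, keep(white) cells(b se p) stats(N_g N)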

        As expected, poisson and nbreg with unit dummies cannot estimate a coefficient for white, and xtpoisson, fe refuses to run at all. However, xtnbreg, fe does estimate one. I am not sure what is happening here.


        The output is below; the esttab columns are poisson (1), xtnbreg, fe (2), and nbreg (3):

        Code:
        . eststo nl1: xtpoisson wks_work white, fe
        note: white dropped because it is constant within group
        independent variables required with fixed-effects model
        r(102);
        
        . esttab nl*, keep(white) cells(b se p) stats(N_g N)
        
        ---------------------------------------------------
                              (1)          (2)          (3)
                         wks_work     wks_work     wks_work
                           b/se/p       b/se/p       b/se/p
        ---------------------------------------------------
        wks_work                                           
        white                   0     .1037082            0
                                .     .0840918            .
                                .     .2174734            .
        ---------------------------------------------------
        N_g                                473             
        N                    3121         3121         3121
        ---------------------------------------------------



        • #5
          -xtnbreg, fe- is not a fixed-effects model in the usual sense. The fixed effects apply only to the dispersion parameter, not to the full model. That is why -xtnbreg, fe- provides an estimate for white: there are no "fixed effects" in that part of the model, and the part that does involve the fixed effects, the dispersion parameter, has nothing to do with the white variable.
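
          For reference, here is the model behind -xtnbreg, fe- as I recall it from the Methods and formulas section of the [XT] xtnbreg manual entry. Treat the exact parameterization as an assumption and check it against the manual; the symbols below follow my recollection of its notation.

          ```latex
          \begin{aligned}
          y_{it} \mid \gamma_{it} &\sim \mathrm{Poisson}(\gamma_{it}), \\
          \gamma_{it} \mid \delta_i &\sim \mathrm{Gamma}(\lambda_{it}, \delta_i), \qquad
          \lambda_{it} = \exp(x_{it}\beta), \\
          \mathrm{E}(y_{it} \mid \delta_i) &= \lambda_{it}\,\delta_i, \qquad
          \frac{\mathrm{Var}(y_{it} \mid \delta_i)}{\mathrm{E}(y_{it} \mid \delta_i)} = 1 + \delta_i .
          \end{aligned}
          ```

          The unit-specific quantity is the dispersion term \delta_i, which the conditional likelihood removes by conditioning on each unit's total count. The regressors enter only through \lambda_{it}, which carries no unit-specific intercept, so a regressor that is constant within unit, such as white, is not absorbed.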



          • #6
            Clyde Schechter, many thanks. I assume the Stata manual is the best place to find more details on what you have explained here. I will look there.

