
  • Coefficients omitted in SSC -reghdfe- but not in -regress- : understanding the difference

    Dear Statalist members,

    I am encountering a puzzling issue where some coefficients are being omitted when using reghdfe but remain included when using the standard regress command. I would appreciate guidance on understanding why this occurs and how to address it.

    Problem Description:
    When I run my regression using regress, all variables are included in the output. However, when I use reghdfe with the same specification, some coefficients are dropped and flagged as "omitted" in the results table. Could anyone explain why, please? Here is the output:

    1. With -regress-:

    Code:
    . regress d_log_pop ss_wind_ccaa new_inst_cap_wind_ccaa com_wind_power_dens_std i.com_id i.year, robust
    note: 324.com_id omitted because of collinearity.
    
    Linear regression                               Number of obs     =      6,904
                                                    F(338, 6565)      =      40.33
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.6568
                                                    Root MSE          =     .00958
    
    -----------------------------------------------------------------------------------------
                            |               Robust
                  d_log_pop | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    ------------------------+----------------------------------------------------------------
               ss_wind_ccaa |  -1.97e-08   4.27e-07    -0.05   0.963    -8.57e-07    8.18e-07
     new_inst_cap_wind_ccaa |   2.62e-07   6.58e-07     0.40   0.691    -1.03e-06    1.55e-06
    com_wind_power_dens_std |   .0022696   .0011517     1.97   0.049      .000012    .0045273
                            |
                     com_id |
                         2  |   .0123962   .0023602     5.25   0.000     .0077695    .0170229
                        ... |     ...               ...                ...       ...            ...               ...
                       324  |          0  (omitted)
                            |
                       year |
                      2002  |   .0023319   .0008865     2.63   0.009      .000594    .0040698
                      2003  |   .0022787   .0009542     2.39   0.017     .0004082    .0041492
                      2004  |   .0017792   .0007748     2.30   0.022     .0002604    .0032981
                      2005  |   .0082689   .0008766     9.43   0.000     .0065505    .0099873
                      2006  |   .0026888   .0008097     3.32   0.001     .0011014    .0042761
                      2007  |    .004634   .0009152     5.06   0.000       .00284     .006428
                      2008  |    .008827   .0008659    10.19   0.000     .0071295    .0105245
                      2009  |   .0000832   .0007619     0.11   0.913    -.0014104    .0015767
                      2010  |  -.0041022   .0007166    -5.72   0.000    -.0055069   -.0026974
                      2011  |  -.0031747   .0007821    -4.06   0.000    -.0047078   -.0016416
                      2012  |  -.0098826   .0007537   -13.11   0.000      -.01136   -.0084051
                      2013  |   -.016703   .0008054   -20.74   0.000    -.0182818   -.0151242
                      2014  |  -.0167145   .0009733   -17.17   0.000    -.0186225   -.0148065
                      2015  |  -.0130431   .0007768   -16.79   0.000    -.0145658   -.0115204
                      2016  |  -.0148028   .0007322   -20.22   0.000    -.0162381   -.0133674
                      2017  |  -.0130558   .0006758   -19.32   0.000    -.0143807    -.011731
                      2018  |  -.0106227   .0006497   -16.35   0.000    -.0118963   -.0093491
                      2019  |  -.0058817   .0006895    -8.53   0.000    -.0072334     -.00453
                      2020  |  -.0049631   .0006577    -7.55   0.000    -.0062523   -.0036738
                      2021  |  -.0030034   .0008531    -3.52   0.000    -.0046758   -.0013309
                      2022  |  -.0041286   .0007229    -5.71   0.000    -.0055458   -.0027114
                            |
                      _cons |   .0059756   .0013704     4.36   0.000     .0032892    .0086621
    -----------------------------------------------------------------------------------------

    2. And with -reghdfe-:

    Code:
    . reghdfe d_log_pop ss_wind_ccaa new_inst_cap_wind_ccaa com_pv_potential_std,  abs(com_id year) vce(robust)
    (MWFE estimator converged in 4 iterations)
    note: com_pv_potential_std is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-09)
    
    HDFE Linear regression                            Number of obs   =      6,904
    Absorbing 2 HDFE groups                           F(   2,   6565) =       0.08
                                                      Prob > F        =     0.9238
                                                      R-squared       =     0.6568
                                                      Adj R-squared   =     0.6391
                                                      Within R-sq.    =     0.0000
                                                      Root MSE        =     0.0096
    
    ----------------------------------------------------------------------------------------
                           |               Robust
                 d_log_pop | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -----------------------+----------------------------------------------------------------
              ss_wind_ccaa |  -1.97e-08   4.27e-07    -0.05   0.963    -8.57e-07    8.18e-07
    new_inst_cap_wind_ccaa |   2.62e-07   6.58e-07     0.40   0.691    -1.03e-06    1.55e-06
      com_pv_potential_std |          0  (omitted)
                     _cons |   .0000523   .0001441     0.36   0.717    -.0002302    .0003349
    ----------------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
          com_id |       316           0         316     |
            year |        22           1          21     |
    -----------------------------------------------------+
    
    
    .
    end of do-file
    Thanks for your help.
    Best,

    Michael

  • #2
    Well, you can't compare these two regressions in the first place because they contain different variables. The -regress- version contains a variable com_wind_power_dens_std, where the -reghdfe- version has com_pv_potential_std.

    So disregarding the -regress- equation, I'll take it you are just wondering why com_pv_potential_std is omitted from -reghdfe-. Stata has actually told you why:
    note: com_pv_potential_std is probably collinear with the fixed effects (all partialled-out values are close to zero; tol = 1.0e-09)
    When you have a set of collinear variables in a linear model, something must be done to identify the model, and the default approach in both -regress- and -reghdfe- is to eliminate one (or more, if necessary) of them to break the collinearity. Different commands may make different choices, and the same command may choose differently depending on the order in which the variables appear in the command's varlist. -reghdfe- chose to eliminate com_pv_potential_std in order to break the collinear relationship with the fixed effects.

    This is perfectly normal and you can just proceed normally from here.
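
    To see the point about different commands making different choices, here is a minimal sketch on simulated data. It is not the original poster's dataset: the names id, year, x, z, y and sd_z are all made up for illustration, and z plays the role of a variable that is constant within each panel unit. In the output in #1, -regress- kept the variable and omitted 324.com_id instead, so the coefficient it reports for such a variable mostly reflects which dummy happened to be dropped.

    ```stata
    * Simulated panel; all names (id, year, x, z, y, sd_z) are hypothetical
    clear all
    set seed 12345
    set obs 50                          // 50 panel units
    generate id = _n
    generate z  = rnormal()             // drawn once per unit: constant within id
    expand 10                           // 10 yearly observations per unit
    bysort id: generate year = 2000 + _n
    generate x = rnormal()              // regressor that does vary within id
    generate y = 1 + 0.5*x + 2*z + rnormal()

    * Within-unit standard deviation of z. If it is zero for every unit,
    * z is an exact linear combination of the id dummies.
    bysort id: egen double sd_z = sd(z)
    summarize sd_z

    * Dummy-variable version: something must be omitted to break the collinearity
    regress y x z i.id i.year, vce(robust)

    * Absorbed version (needs reghdfe from SSC, which in turn depends on ftools)
    reghdfe y x z, absorb(id year) vce(robust)
    ```

    The coefficient on x should agree between the two commands, as the first two coefficients do in #1. Only the handling of the collinear term differs. Running the same sd() check on your own variable by com_id is a quick way to confirm what the -reghdfe- note is telling you.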



    • #3
      If the variable is constant across a fixed effect (say year), then the fixed effect will eat it. Say, for example, you had households in a county and you include county median income but also a county fixed effect. The fixed effect will cause the median income to be dropped as it doesn't vary within the county. You can use xtreg, cre (based on Wooldridge's Mundlak stuff) if you want a coefficient, but study up on it so you know what's what.
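
      A minimal sketch of the county example, on simulated data, in case the cre option is not available in your Stata installation. All names here (county, med_income, a, hh_size, hh_size_bar, y) are made up for illustration. The second model applies the Mundlak device by hand: add the county mean of each regressor that varies within county, then fit a random-effects model. This is a hand-rolled approximation of the idea, not a claim about what -xtreg, cre- does internally.

      ```stata
      * Hypothetical county/household data; every name is invented
      clear all
      set seed 2025
      set obs 40                                // 40 counties
      generate county = _n
      generate med_income = rnormal(50, 10)     // county-level: no within-county variation
      generate a = rnormal()                    // unobserved county effect
      expand 25                                 // 25 households per county
      generate hh_size = rpoisson(3) + 0.5*a    // correlated with the county effect
      generate y = 2 + 0.3*hh_size + 0.05*med_income + a + rnormal()

      * Fixed effects: med_income is collinear with the county effects and is omitted
      xtset county
      xtreg y hh_size med_income, fe

      * Mundlak device by hand: county mean of the within-varying regressor, then RE
      bysort county: egen double hh_size_bar = mean(hh_size)
      xtreg y hh_size hh_size_bar med_income, re vce(cluster county)
      ```

      In the second model the coefficient on hh_size should be close to the fixed-effects estimate, and med_income now gets a coefficient. That coefficient is only interpretable if med_income is uncorrelated with whatever county-level heterogeneity remains after adding the means, which is an assumption and cannot be checked from the regression itself.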



      • #4
        Sorry,

        Thank you for pointing this out. I've copied and pasted this by hand for easier understanding, as it was a loop.
        Indeed, in the -reghdfe- regression, the variable should be com_wind_power_dens_std, not com_pv_potential_std. But the same logic applies: both are dropped from the regressions.

        Thank you for your explanations, Clyde Schechter and George Ford.
        Everything is much clearer now.

        All the best,
        Last edited by Michael Duarte Goncalves; 17 Jun 2025, 08:59.

