
  • Omitted coefficients using reghdfe

    Hello,

    I am estimating regressions that include several sets of fixed effects, plus control variables whose coefficients I am not interested in but that I want to control for. To do so, I am using the command reghdfe. My understanding is that reghdfe accounts for the variables declared in the "absorb" option without estimating their coefficients. I would therefore expect the coefficients on the variables of interest to be the same whether I use reg with all the controls included explicitly (those inside and outside "absorb") or reghdfe.

    However, that is not the case. When I estimate the equation with reg, I obtain a coefficient for each of the variables of interest. When I use reghdfe, it omits the coefficients of some of the variables of interest. With a big dataset, the estimated coefficients of the non-omitted variables are the same as those obtained with reg. With a small sample (such as the one below), the coefficients are quite different, and Stata omits most of the variables of interest.

    The following are examples of the estimations:

    Code:
    reg manager nonblack admit_exp20 admit20_nb admit_noexpbord20 admit20free_nb i.cpuma0010 i.birthyr trend_*, vce(cluster stateyear)
    
    reghdfe manager nonblack admit_exp20 admit20_nb admit_noexpbord20 admit20free_nb, absorb(cpuma0010 birthyr statefip#c.birthyr) vce(cluster stateyear)
    trend_* --> These are statefip-birthyr (state by year of birth) specific time trends. Therefore, they should be equivalent to what Stata would generate with statefip#c.birthyr.
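
    A minimal sketch of how trend variables of this kind can be built, to make the claimed equivalence concrete. The four toy rows and the names mytrend_* are illustrative assumptions, not the actual construction used for trend_*. Each variable equals birthyr for one state and zero elsewhere, which is what the factor-variable term i.statefip#c.birthyr generates in regress.

    Code:
    * Toy data: a few (statefip, birthyr) pairs, for illustration only
    clear
    input float(statefip birthyr)
    11 1964
    11 1966
    12 1968
    13 1966
    end
    * One trend per state: birthyr where statefip matches, 0 otherwise
    levelsof statefip, local(states)
    foreach s of local states {
        gen double mytrend_`s' = birthyr * (statefip == `s')
    }
    list, sepby(statefip)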


    The following is an example of the dataset I used to generate the estimations above:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(statefip birthyr manager nonblack admit20_nb admit_noexpbord20 admit20free_nb admit_exp20) int cpuma0010 float(stateyear trend_1 trend_2 trend_3 trend_4 trend_5 trend_6)
    12 1968 0 0 0 0 0 0 255 445    0 1968    0    0 0    0
    13 1966 0 0 0 1 0 1 262 491    0    0 1966    0 0    0
    15 1965 0 1 0 . . 0 284 538    0    0    0 1965 0    0
    15 1965 0 1 0 . . 0 288 538    0    0    0 1965 0    0
    12 1968 0 1 0 0 0 0 253 445    0 1968    0    0 0    0
    12 1964 0 1 0 1 1 0 204 441    0 1964    0    0 0    0
    15 1967 0 1 0 . . 0 284 540    0    0    0 1967 0    0
    12 1968 0 1 0 0 0 0 257 445    0 1968    0    0 0    0
    15 1968 0 1 0 . . 0 285 541    0    0    0 1968 0    0
    15 1967 0 1 0 . . 0 282 540    0    0    0 1967 0    0
    12 1967 0 1 0 0 0 0 239 444    0 1967    0    0 0    0
    17 1968 0 1 1 1 1 1 310 637    0    0    0    0 0 1968
    17 1965 0 0 0 1 0 1 335 634    0    0    0    0 0 1965
    15 1966 0 1 0 . . 0 283 539    0    0    0 1966 0    0
    11 1966 0 1 0 1 1 0 201 395 1966    0    0    0 0    0
    13 1967 0 1 1 1 1 1 273 492    0    0 1967    0 0    0
    17 1965 0 0 0 1 0 1 337 634    0    0    0    0 0 1965
    12 1967 0 1 0 0 0 0 239 444    0 1967    0    0 0    0
    11 1964 0 1 0 1 1 0 201 393 1964    0    0    0 0    0
    11 1964 0 0 0 1 0 0 202 393 1964    0    0    0 0    0
    11 1965 0 0 0 1 0 0 201 394 1965    0    0    0 0    0
    11 1964 1 0 0 1 0 0 202 393 1964    0    0    0 0    0
    11 1966 0 1 0 1 1 0 201 395 1966    0    0    0 0    0
    11 1968 0 1 0 1 1 0 200 397 1968    0    0    0 0    0
    11 1966 0 1 0 1 1 0 200 395 1966    0    0    0 0    0
    11 1966 0 1 0 1 1 0 201 395 1966    0    0    0 0    0
    11 1964 0 0 0 1 0 0 201 393 1964    0    0    0 0    0
    11 1966 1 1 0 1 1 0 201 395 1966    0    0    0 0    0
    11 1964 0 0 0 1 0 0 202 393 1964    0    0    0 0    0
    11 1967 0 0 0 1 0 0 202 396 1967    0    0    0 0    0
    11 1968 0 1 0 1 1 0 201 397 1968    0    0    0 0    0
    11 1965 0 0 0 1 0 0 202 394 1965    0    0    0 0    0
    11 1966 0 1 0 1 1 0 202 395 1966    0    0    0 0    0
    11 1964 0 0 0 1 0 0 202 393 1964    0    0    0 0    0
    11 1967 0 1 0 1 1 0 201 396 1967    0    0    0 0    0
    11 1966 0 0 0 1 0 0 202 395 1966    0    0    0 0    0
    11 1964 0 0 0 1 0 0 202 393 1964    0    0    0 0    0
    11 1965 0 0 0 1 0 0 202 394 1965    0    0    0 0    0
    11 1968 0 1 0 1 1 0 201 397 1968    0    0    0 0    0
    11 1967 0 1 0 1 1 0 201 396 1967    0    0    0 0    0
    11 1968 0 0 0 1 0 0 202 397 1968    0    0    0 0    0
    11 1967 1 0 0 1 0 0 201 396 1967    0    0    0 0    0
    11 1964 0 0 0 1 0 0 201 393 1964    0    0    0 0    0
    11 1968 1 1 0 1 1 0 200 397 1968    0    0    0 0    0
    11 1965 0 1 0 1 1 0 200 394 1965    0    0    0 0    0
    11 1966 0 0 0 1 0 0 201 395 1966    0    0    0 0    0
    11 1968 1 0 0 1 0 0 201 397 1968    0    0    0 0    0
    11 1968 0 1 0 1 1 0 201 397 1968    0    0    0 0    0
    11 1965 0 0 0 1 0 0 202 394 1965    0    0    0 0    0
    11 1965 0 1 0 1 1 0 201 394 1965    0    0    0 0    0
    11 1966 0 1 0 1 1 0 200 395 1966    0    0    0 0    0
    11 1964 0 0 0 1 0 0 201 393 1964    0    0    0 0    0
    11 1966 0 0 0 1 0 0 201 395 1966    0    0    0 0    0
    11 1964 0 0 0 1 0 0 201 393 1964    0    0    0 0    0
    11 1968 0 0 0 1 0 0 201 397 1968    0    0    0 0    0
    11 1965 0 0 0 1 0 0 201 394 1965    0    0    0 0    0
    11 1965 0 0 0 1 0 0 201 394 1965    0    0    0 0    0
    11 1965 0 0 0 1 0 0 202 394 1965    0    0    0 0    0
    11 1968 0 1 0 1 1 0 201 397 1968    0    0    0 0    0
    11 1968 0 0 0 1 0 0 201 397 1968    0    0    0 0    0
    11 1967 0 0 0 1 0 0 202 396 1967    0    0    0 0    0
    11 1966 0 0 0 1 0 0 202 395 1966    0    0    0 0    0
    11 1968 1 1 0 1 1 0 201 397 1968    0    0    0 0    0
    11 1966 0 0 0 1 0 0 202 395 1966    0    0    0 0    0
    11 1965 0 1 0 1 1 0 201 394 1965    0    0    0 0    0
    11 1965 0 0 0 1 0 0 201 394 1965    0    0    0 0    0
    11 1966 0 0 0 1 0 0 202 395 1966    0    0    0 0    0
    11 1966 0 1 0 1 1 0 201 395 1966    0    0    0 0    0
    11 1968 1 1 0 1 1 0 200 397 1968    0    0    0 0    0
    11 1964 0 0 0 1 0 0 202 393 1964    0    0    0 0    0
    11 1967 0 1 0 1 1 0 201 396 1967    0    0    0 0    0
    11 1964 0 0 0 1 0 0 201 393 1964    0    0    0 0    0
    11 1965 0 0 0 1 0 0 201 394 1965    0    0    0 0    0
    11 1964 0 0 0 1 0 0 201 393 1964    0    0    0 0    0
    11 1965 0 0 0 1 0 0 202 394 1965    0    0    0 0    0
    11 1966 1 1 0 1 1 0 200 395 1966    0    0    0 0    0
    11 1966 0 1 0 1 1 0 200 395 1966    0    0    0 0    0
    11 1968 1 1 0 1 1 0 201 397 1968    0    0    0 0    0
    11 1968 0 0 0 1 0 0 202 397 1968    0    0    0 0    0
    11 1968 0 0 0 1 0 0 201 397 1968    0    0    0 0    0
    11 1967 0 0 0 1 0 0 202 396 1967    0    0    0 0    0
    11 1967 1 1 0 1 1 0 201 396 1967    0    0    0 0    0
    11 1966 0 0 0 1 0 0 202 395 1966    0    0    0 0    0
    11 1967 0 0 0 1 0 0 202 396 1967    0    0    0 0    0
    11 1966 0 1 0 1 1 0 201 395 1966    0    0    0 0    0
    11 1968 0 0 0 1 0 0 202 397 1968    0    0    0 0    0
    11 1967 0 1 0 1 1 0 201 396 1967    0    0    0 0    0
    11 1967 0 0 0 1 0 0 201 396 1967    0    0    0 0    0
    11 1968 1 1 0 1 1 0 200 397 1968    0    0    0 0    0
    11 1966 0 1 0 1 1 0 201 395 1966    0    0    0 0    0
    11 1965 0 1 0 1 1 0 200 394 1965    0    0    0 0    0
    11 1967 0 0 0 1 0 0 202 396 1967    0    0    0 0    0
    11 1964 0 0 0 1 0 0 202 393 1964    0    0    0 0    0
    11 1964 0 1 0 1 1 0 201 393 1964    0    0    0 0    0
    11 1968 0 1 0 1 1 0 201 397 1968    0    0    0 0    0
    11 1966 0 0 0 1 0 0 201 395 1966    0    0    0 0    0
    11 1964 1 0 0 1 0 0 201 393 1964    0    0    0 0    0
    11 1968 0 1 0 1 1 0 201 397 1968    0    0    0 0    0
    11 1966 0 0 0 1 0 0 201 395 1966    0    0    0 0    0
    11 1967 0 1 0 1 1 0 201 396 1967    0    0    0 0    0
    end
    label values statefip statefip_lbl
    label def statefip_lbl 11 "District of Columbia", modify
    label def statefip_lbl 12 "Florida", modify
    label def statefip_lbl 13 "Georgia", modify
    label def statefip_lbl 15 "Hawaii", modify
    label def statefip_lbl 17 "Illinois", modify


    The following are the results of such estimations:

    Code:
    . reg manager nonblack admit_exp20 admit20_nb admit_noexpbord20 admit20free_nb i.cpuma0010 i.birthyr tren
    > d_*, vce(cluster stateyear)
    note: 255.cpuma0010 omitted because of collinearity
    note: 257.cpuma0010 omitted because of collinearity
    note: 310.cpuma0010 omitted because of collinearity
    note: 337.cpuma0010 omitted because of collinearity
    note: trend_1 omitted because of collinearity
    note: trend_2 omitted because of collinearity
    note: trend_3 omitted because of collinearity
    note: trend_4 omitted because of collinearity
    note: trend_5 omitted because of collinearity
    
    Linear regression                               Number of obs     =         94
                                                    F(2, 11)          =          .
                                                    Prob > F          =          .
                                                    R-squared         =     0.1838
                                                    Root MSE          =      .3353

                                       (Std. Err. adjusted for 12 clusters in stateyear)
    ------------------------------------------------------------------------------------
                       |               Robust
               manager |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------------+----------------------------------------------------------------
              nonblack |  -1.02e-14   1.19e-14    -0.85   0.411    -3.63e-14    1.60e-14
           admit_exp20 |  -.2653784   .1433278    -1.85   0.091    -.5808408     .050084
            admit20_nb |  -.2726897   .0501327    -5.44   0.000     -.383031   -.1623483
     admit_noexpbord20 |   .5479404   .1346726     4.07   0.002      .251528    .8443529
        admit20free_nb |  -.0098723   .0563795    -0.18   0.864    -.1339628    .1142181
                       |
             cpuma0010 |
                  201  |  -.2675532   .1452092    -1.84   0.092    -.5871564      .05205
                  202  |  -.3605872   .1864338    -1.93   0.079    -.7709252    .0497509
                  204  |  -.4246173   .1518185    -2.80   0.017    -.7587674   -.0904671
                  239  |   .0862023   .0371662     2.32   0.041        .0044    .1680046
                  253  |   7.72e-16   1.02e-15     0.75   0.466    -1.48e-15    3.02e-15
                  255  |          0  (omitted)
                  257  |          0  (omitted)
                  262  |  -.0786729   .0147373    -5.34   0.000    -.1111095   -.0462362
                  273  |   .0862023   .0371662     2.32   0.041        .0044    .1680046
                  310  |          0  (omitted)
                  335  |  -5.81e-15   7.51e-15    -0.77   0.456    -2.23e-14    1.07e-14
                  337  |          0  (omitted)
                       |
               birthyr |
                 1965  |  -.1691112   .0222369    -7.60   0.000    -.2180542   -.1201682
                 1966  |  -.0904383   .0334711    -2.70   0.021    -.1641078   -.0167689
                 1967  |   .0272485   .0172123     1.58   0.142    -.0106356    .0651326
                 1968  |   .1134508   .0399003     2.84   0.016     .0256308    .2012708
                       |
               trend_1 |          0  (omitted)
               trend_2 |          0  (omitted)
               trend_3 |          0  (omitted)
               trend_4 |          0  (omitted)
               trend_5 |          0  (omitted)
                 _cons |  -.1134508   .0399003    -2.84   0.016    -.2012708   -.0256308
    ------------------------------------------------------------------------------------
    
    
    
    . reghdfe manager nonblack admit_exp20 admit20_nb admit_noexpbord20 admit20free_nb, absorb(cpuma0010 birt
    > hyr statefip#c.birthyr) vce(cluster stateyear)
    (dropped 9 singleton observations)
    note: admit_noexpbord20 is probably collinear with the fixed effects (all partialled-out values are close
    >  to zero; tol = 1.0e-09)
    (MWFE estimator converged in 5 iterations)
    note: admit_exp20 omitted because of collinearity
    note: admit20_nb omitted because of collinearity
    note: admit_noexpbord20 omitted because of collinearity
    note: admit20free_nb omitted because of collinearity
    
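    HDFE Linear regression                            Number of obs   =         85
    Absorbing 3 HDFE groups                           F(   1,      5) =       0.03
    Statistics robust to heteroskedasticity           Prob > F        =     0.8692
                                                      R-squared       =     0.1709
                                                      Adj R-squared   =     0.0589
                                                      Within R-sq.    =     0.0001
    Number of clusters (stateyear) =          6       Root MSE        =     0.3398

                                        (Std. Err. adjusted for 6 clusters in stateyear)
    ------------------------------------------------------------------------------------
                       |               Robust
               manager |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------------+----------------------------------------------------------------
              nonblack |  -.0098723   .0569517    -0.17   0.869    -.1562713    .1365266
           admit_exp20 |          0  (omitted)
            admit20_nb |          0  (omitted)
     admit_noexpbord20 |          0  (omitted)
        admit20free_nb |          0  (omitted)
                 _cons |     .14559   .0254608     5.72   0.002      .080141    .2110389
    ------------------------------------------------------------------------------------

    Absorbed degrees of freedom:
    ------------------------------------------------------------+
            Absorbed FE | Categories  - Redundant  = Num. Coefs |
    --------------------+---------------------------------------|
              cpuma0010 |         4           0           4     |
                birthyr |         5           1           4     |
     statefip#c.birthyr |         2           0           2    ?|
    ------------------------------------------------------------+
    ? = number of redundant parameters may be higher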
    As you can see, when using reghdfe, Stata omits all of the variables of interest, except for one of them, while when using reg, I obtain a coefficient for each one of these variables.


    So, what I would like to understand is:

    1) What are the differences between reg and reghdfe that are generating different coefficients?

    1) Why does reghdfe omit variables, whereas reg estimates coefficients for each variable?

    2) How can I prevent Stata from omitting the variables of interest and instead make it omit the coefficients of the variables I declared in "absorb"?

    I would appreciate any help!
    Last edited by Mayra Pineda; 13 Jan 2020, 20:31.

  • #2
    reghdfe is from SSC, as you are asked to explain (see FAQ Advice #12).

    1) What are the differences between reg and reghdfe that are generating different coefficients?
    1) Why does reghdfe omit variables, whereas reg estimates coefficients for each variable?
    The problem with estimating a fixed effects model with regress is that you do not know which variables are collinear with the fixed effects unless you change the omitted categories of the fixed effects and check whether your coefficient estimates of the time-varying variables change. Compare

    Code:
    reg manager nonblack admit_exp20 admit20_nb admit_noexpbord20 admit20free_nb i.cpuma0010 i.birthyr trend_*, vce(cluster stateyear)
    reg manager nonblack admit_exp20 admit20_nb admit_noexpbord20 admit20free_nb ib201.cpuma0010 i.birthyr trend_*, vce(cluster stateyear)
    The fact that the coefficients on your time-varying variables change as you change the base level of one of your fixed effects variables is an indication that these time-varying variables are collinear with the fixed effects. reghdfe will give you a direct indication of this collinearity.
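
    A more direct check can be sketched on made-up data (not the dataset from post #1): regress a candidate variable on the fixed-effect dummies alone and inspect the residuals. The variables id, x, and x_resid are hypothetical. If every residual is numerically zero, the fixed effects fully explain the variable and no variation is left to identify its coefficient.

    Code:
    * Toy data: 4 groups of 2 observations each
    clear
    set obs 8
    gen id = ceil(_n/2)
    * x is constant within id, so it is spanned by the id dummies
    gen x = (id >= 3)
    * Partial the fixed effects out of x and look at what is left
    regress x i.id
    predict double x_resid, residuals
    summarize x_resid

    With the data from post #1, the analogous first step would presumably be regress admit_noexpbord20 i.cpuma0010 i.birthyr i.statefip#c.birthyr, restricted to the estimation sample.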

    2) How can I prevent Stata from omitting the variables of interest and instead make it omit the coefficients of the variables I declared in "absorb"?
    There is no need to do this once you understand what is going on. The bottom line is that you cannot identify the effects of variables that are collinear with the fixed effects.
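
    As an illustration of that bottom line, here is a small simulated example using the official command areg in place of reghdfe (an assumption made so the sketch runs without installing anything from SSC). All variable names and the data-generating process are hypothetical. x has no variation within id, so it is omitted once id is absorbed, while z, which varies within id, is still estimated.

    Code:
    * Simulated data: 4 groups of 10 observations each
    clear
    set seed 12345
    set obs 40
    gen id = ceil(_n/10)
    * x is constant within id; z varies within id
    gen x = (id >= 3)
    gen z = rnormal()
    gen y = 1 + 0.5*z + id + rnormal()
    * Absorbing id leaves nothing to identify x, so it is omitted; z is kept
    areg y x z, absorb(id)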



    • #3
      Thanks for the explanation, Andrew!
