Adjusted predictions for interaction in within-regression (-xtreg ,fe- and -margins-)

daniel klein

Join Date: Mar 2014
Posts: 3850

12 Aug 2014, 14:34

Dear Listers,

I am running a linear fixed-effects (within) regression model including an interaction between one variable that is constant within panels and one variable that varies within panels.

I want to inspect this interaction graphically and therefore want to obtain adjusted predictions using the margins command. However, Stata seems to be unable to estimate what I want. Here is an illustrative example:

Code:

webuse nlswork ,clear
xtset idcode year

xtreg ln_wage c.hours##i.collgrad ,fe
margins collgrad ,at(hours = (20(10)60))

The (relevant) output is

Code:

. xtreg ln_wage c.hours##i.collgrad ,fe
note: 1.collgrad omitted because of collinearity

Fixed-effects (within) regression               Number of obs      =     28467
Group variable: idcode                          Number of groups   =      4710

R-sq:  within  = 0.0011                         Obs per group: min =         1
       between = 0.1868                                        avg =       6.0
       overall = 0.1034                                        max =        15

                                                F(2,23755)         =     13.05
corr(u_i, Xb)  = -0.4978                        Prob > F           =    0.0000

----------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
           hours |   .0010481   .0002818     3.72   0.000     .0004957    .0016004
      1.collgrad |          0  (omitted)
                 |
collgrad#c.hours |
              1  |  -.0030347   .0006335    -4.79   0.000    -.0042764   -.0017931
                 |
           _cons |    1.65641   .0094353   175.55   0.000     1.637916    1.674904
-----------------+----------------------------------------------------------------
         sigma_u |  .44574269
         sigma_e |  .32025546
             rho |  .65954017   (fraction of variance due to u_i)
----------------------------------------------------------------------------------
F test that all u_i=0:     F(4709, 23755) =     6.91         Prob > F = 0.0000

.
. margins collgrad ,at(hours = (20(10)60))

Adjusted predictions                              Number of obs   =      28467
Model VCE    : Conventional

Expression   : Linear prediction, predict()

1._at        : hours           =          20

2._at        : hours           =          30

3._at        : hours           =          40

4._at        : hours           =          50

5._at        : hours           =          60

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_at#collgrad |
        1 0  |          .  (not estimable)
        1 1  |          .  (not estimable)
        2 0  |          .  (not estimable)
        2 1  |          .  (not estimable)
        3 0  |          .  (not estimable)
        3 1  |          .  (not estimable)
        4 0  |          .  (not estimable)
        4 1  |          .  (not estimable)
        5 0  |          .  (not estimable)
        5 1  |          .  (not estimable)
------------------------------------------------------------------------------

I do not really understand what the problem is. Is there a theoretical/statistical reason the requested marginal effects cannot be estimated? Are these effects not properly defined in this case? Am I missing crucial conceptual issues here?

Or is this result merely a 'technical' problem?

Any thoughts?

Your time is appreciated.
Daniel

I am using Stata 12.1, fully updated, on a Windows 7 32-bit machine.

Last edited by daniel klein; 12 Aug 2014, 14:37.

Tags: fixed effects, interaction, Marginal Effects, margins, xtreg

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 700

12 Aug 2014, 14:58

I suspect including collgrad as a main effect is causing the problem.
Note that 1.collgrad was omitted because it is constant within
idcode, which is the same as saying it is collinear with idcode.

Respecify the model without a main effect for collgrad and the
computations become numerically more stable:

Code:

. xtreg ln_wage hours c.hours#i.collgrad ,fe

Fixed-effects (within) regression               Number of obs      =     28467
Group variable: idcode                          Number of groups   =      4710

R-sq:  within  = 0.0011                         Obs per group: min =         1
       between = 0.1868                                        avg =       6.0
       overall = 0.1034                                        max =        15

                                                F(2,23755)         =     13.05
corr(u_i, Xb)  = -0.4978                        Prob > F           =    0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       hours |   .0010481   .0002818     3.72   0.000     .0004957    .0016004
             |
    collgrad#|
     c.hours |
          1  |  -.0030347   .0006335    -4.79   0.000    -.0042764   -.0017931
             |
       _cons |    1.65641   .0094353   175.55   0.000     1.637916    1.674904
-------------+----------------------------------------------------------------
     sigma_u |  .44574269
     sigma_e |  .32025546
         rho |  .65954017   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(4709, 23755) =     6.91         Prob > F = 0.0000

. margins collgrad ,at(hours = (20(10)60))

Adjusted predictions                              Number of obs   =      28467
Model VCE    : Conventional

Expression   : Linear prediction, predict()

1._at        : hours           =          20

2._at        : hours           =          30

3._at        : hours           =          40

4._at        : hours           =          50

5._at        : hours           =          60

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_at#collgrad |
        1 0  |   1.677371   .0049988   335.55   0.000     1.667574    1.687169
        1 1  |   1.616676   .0116339   138.96   0.000     1.593874    1.639478
        2 0  |   1.687852   .0040983   411.85   0.000     1.679819    1.695884
        2 1  |   1.596809   .0159708    99.98   0.000     1.565507    1.628111
        3 0  |   1.698332   .0049485   343.20   0.000     1.688633    1.708031
        3 1  |   1.576942    .020956    75.25   0.000     1.535869    1.618015
        4 0  |   1.708813   .0069327   246.48   0.000     1.695225    1.722401
        4 1  |   1.557075   .0262224    59.38   0.000      1.50568     1.60847
        5 0  |   1.719293   .0093553   183.78   0.000     1.700957    1.737629
        5 1  |   1.537208   .0316298    48.60   0.000     1.475215    1.599202
------------------------------------------------------------------------------


Richard Williams

Join Date: Apr 2014

Posts: 4994
#3

12 Aug 2014, 15:18

Interesting. I just automatically include all main effects with any interaction terms -- but with a fixed effects model you have to resist the urge to do that with time-invariant variables.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
daniel klein

Join Date: Mar 2014

Posts: 3850
#4

13 Aug 2014, 00:48

Jeff, thank you very much for your answer. This is essentially what I had in mind when asking about a purely 'technical' problem. However, the simple solution of using the single # operator never occurred to me.

I was under the impression that using the double ## results in the same model. Stata does a great job after all. It creates all the lower-order terms, correctly recognizes that one of them does not vary within panel units, and excludes it from the model. I therefore have one last question. When you say leaving out the lower-order term by hand makes the approach 'numerically stable', does this imply the possibility that estimation results might differ depending on who excludes collinear variables: Stata or me?

Thanks again.
Daniel

Last edited by daniel klein; 13 Aug 2014, 01:23. Reason: layout and typos
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#5

13 Aug 2014, 01:01

In the current case it should not matter for the estimation results whether Stata omits time-invariant regressors automatically or you omit them manually. There are other cases where the results might differ:

Consider the case where you specify a full set of dummy variables in addition to the constant. Stata will drop one of the dummies, but you have no control over which one it chooses. Depending on the estimation purpose, it might be better in this case to manually drop one of the dummies to obtain "stable" results.

If that is not what Jeff has in mind, I would be interested in further insights, too.

https://www.kripfganz.de/stata/
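
The claim that the two specifications give the same estimates can be checked side by side. This is a small sketch, not part of the original post, assuming the nlswork example from the first post; the stored-result names full and nomain are arbitrary.

```stata
webuse nlswork, clear
xtset idcode year

* specification with ##: Stata drops 1.collgrad itself
quietly xtreg ln_wage c.hours##i.collgrad, fe
estimates store full

* specification with the time-invariant main effect left out by hand
quietly xtreg ln_wage c.hours c.hours#i.collgrad, fe
estimates store nomain

* coefficients and standard errors of both fits in one table
estimates table full nomain, b(%10.7f) se(%10.7f)
```

The regression output shown earlier in the thread already reports the same coefficients and standard errors for both fits; the table merely lines them up.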
daniel klein

Join Date: Mar 2014

Posts: 3850
#6

13 Aug 2014, 01:36

Sebastian,

I see your point that it might be desirable to decide which of the collinear variables to exclude for substantive reasons.

This is, however, not what I meant. In the case of indicator variables representing one categorical variable, the models are essentially the same (i.e. equivalent) in terms of computation, no matter which indicator we choose to exclude. I was wondering instead whether Stata would estimate higher standard errors or, worse, different point estimates for the same predictors in the same model. I cannot imagine this could possibly be the case, but it is always better to ask.

Best
Daniel

Last edited by daniel klein; 13 Aug 2014, 02:06.
Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014

Posts: 700
#7

13 Aug 2014, 10:37

"When you say leaving out the lower order term by hand makes the approach 'numerically stable', does this imply the possibility that estimation results might differ depending on who excludes collinear variables: Stata or me?"

Not at all. The estimation is stable.

I meant that computation of the H matrix would be more numerically stable.
This H matrix is documented in the 'Estimable function' subsection of the
'Methods and formulas' for margins. margins uses this matrix to determine
which of its margin calculations are estimable. H can be sensitive to collinear
factor variables when not all the information is present in e(b), as is the case
with xtreg, fe where the fixed effect is not explicitly part of the estimated
parameters.
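
A related, documented switch is the noestimcheck option of margins, which skips the estimability check altogether. The following is only a sketch, not part of the original post, under the assumption that the omitted 1.collgrad term is the sole obstacle in the example from the first post. The option removes a safeguard, so any results are worth comparing against the respecified model.

```stata
webuse nlswork, clear
xtset idcode year

xtreg ln_wage c.hours##i.collgrad, fe

* noestimcheck tells margins not to test estimability;
* use with care, since it also hides genuinely nonestimable margins
margins collgrad, at(hours = (20(10)60)) noestimcheck
```

Dropping the time-invariant main effect, as suggested above, remains the cleaner route because it keeps the check in place.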
daniel klein

Join Date: Mar 2014

Posts: 3850
#8

13 Aug 2014, 10:49

Thanks for clarifying, Jeff. I will have a look at the cited Methods and formulas section. Best, Daniel
Felizia Hanemann

Join Date: Aug 2017

Posts: 1
#9

02 Aug 2017, 06:28

Dear all,

can I bring back this discussion and ask the same question with i.hours instead of c.hours in the model?

Code:

webuse nlswork, clear
xtset idcode year
keep if hours<50
xtreg ln_wage i.hours#i.collgrad, fe
margins, dydx(hours) at(collgrad=(0 1))

Is there any trick how to estimate the marginal effects? Additionally, I would like to have hours=10 as reference level, is this possible?
Thanks a lot in advance!
Felizia
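
On the reference level only: factor-variable notation lets the base level be set with the ib#. operator. The following sketch is not part of the original post and assumes that hours == 10 actually occurs in the data after the keep statement. It leaves open whether margins then reports the effects as estimable.

```stata
webuse nlswork, clear
xtset idcode year
keep if hours < 50

* ib10. requests hours == 10 as the base level (assumes 10 occurs in the data)
xtreg ln_wage ib10.hours#i.collgrad, fe
```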
Renke Schmacker

Join Date: May 2018

Posts: 1
#10

29 May 2018, 07:49

Originally posted by Felizia Hanemann:

Dear all,

can I bring back this discussion and ask the same question with i.hours instead of c.hours in the model?

Code:

webuse nlswork, clear
xtset idcode year
keep if hours<50
xtreg ln_wage i.hours#i.collgrad, fe
margins, dydx(hours) at(collgrad=(0 1))

Is there any trick how to estimate the marginal effects? Additionally, I would like to have hours=10 as reference level, is this possible?
Thanks a lot in advance!
Felizia

As this thread is referred to in several other threads, I would like to follow up on Felizia's question. Is there a way to estimate margins after fixed-effects regression with interacted categorical variables (one of which is time-invariant)?
Peter Meier

Join Date: Apr 2016

Posts: 77
#11

28 May 2021, 06:15

Dear all,

I also need to bring back that old discussion once again. I'm interested in plotting how my outcome changes over 12 months separately for different treatment intensities. When I initially used the fixed effect regression

Code:

xtreg outcome c.T##i.date, fe robust
margins date, at(T=(-1,0,1))
marginsplot

no margins could be calculated, and while searching for the reason I found this thread. I now understand that my regression must not include the main effect of any variable that is perfectly collinear with the fixed effects, as Stata does not seem to take care of such collinear terms automatically in this case.
So I adjusted my code to

Code:

xtreg outcome c.T#i.date i.date, fe robust
margins date, at(T=(-1,0,1))
marginsplot

as treatment intensity T does not vary within my cross-sectional units. However, contrary to what I expected after reading this thread, I still do not get margins, regardless of whether I omit both, either, or none of the main effects.

Would anyone happen to see what else could be the reason that the margins are still reported as not estimable?

Thanks so much, PM