
  • Adjusted predictions for interaction in within-regression (-xtreg ,fe- and -margins-)

    Dear Listers,

    I am running a linear fixed-effects (within) regression model including an interaction between one variable that is constant within panels and one variable that varies within panels.

    I want to inspect this interaction graphically and therefore want to obtain adjusted predictions using the margins command. However, Stata seems to be unable to estimate what I want. Here is an illustrative example:

    Code:
    webuse nlswork ,clear
    xtset idcode year
    
    xtreg ln_wage c.hours##i.collgrad ,fe
    margins collgrad ,at(hours = (20(10)60))
    The (relevant) output is

    Code:
    . xtreg ln_wage c.hours##i.collgrad ,fe
    note: 1.collgrad omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs      =     28467
    Group variable: idcode                          Number of groups   =      4710
    
    R-sq:  within  = 0.0011                         Obs per group: min =         1
           between = 0.1868                                        avg =       6.0
           overall = 0.1034                                        max =        15
    
                                                    F(2,23755)         =     13.05
    corr(u_i, Xb)  = -0.4978                        Prob > F           =    0.0000
    
    ----------------------------------------------------------------------------------
             ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -----------------+----------------------------------------------------------------
               hours |   .0010481   .0002818     3.72   0.000     .0004957    .0016004
          1.collgrad |          0  (omitted)
                     |
    collgrad#c.hours |
                  1  |  -.0030347   .0006335    -4.79   0.000    -.0042764   -.0017931
                     |
               _cons |    1.65641   .0094353   175.55   0.000     1.637916    1.674904
    -----------------+----------------------------------------------------------------
             sigma_u |  .44574269
             sigma_e |  .32025546
                 rho |  .65954017   (fraction of variance due to u_i)
    ----------------------------------------------------------------------------------
    F test that all u_i=0:     F(4709, 23755) =     6.91         Prob > F = 0.0000
    
    .
    . margins collgrad ,at(hours = (20(10)60))
    
    Adjusted predictions                              Number of obs   =      28467
    Model VCE    : Conventional
    
    Expression   : Linear prediction, predict()
    
    1._at        : hours           =          20
    
    2._at        : hours           =          30
    
    3._at        : hours           =          40
    
    4._at        : hours           =          50
    
    5._at        : hours           =          60
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    _at#collgrad |
            1 0  |          .  (not estimable)
            1 1  |          .  (not estimable)
            2 0  |          .  (not estimable)
            2 1  |          .  (not estimable)
            3 0  |          .  (not estimable)
            3 1  |          .  (not estimable)
            4 0  |          .  (not estimable)
            4 1  |          .  (not estimable)
            5 0  |          .  (not estimable)
            5 1  |          .  (not estimable)
    ------------------------------------------------------------------------------
    I do not really understand what the problem is. Is there a theoretical/statistical reason the requested marginal effects cannot be estimated? Are these effects not properly defined in this case? Am I missing crucial conceptual issues here?

    Or is this result merely a 'technical' problem?

    Any thoughts?

    Your time is appreciated.
    Daniel


    I am using Stata 12.1, fully updated, on a 32-bit Windows 7 machine.
    Last edited by daniel klein; 12 Aug 2014, 14:37.

  • #2
    I suspect including collgrad as a main effect is causing the problem.
    Note that 1.collgrad was omitted because it is constant within
    idcode, which is the same as saying it is collinear with idcode.
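
A quick way to check this in the data is sketched below. The helper variables cg_min and cg_max are illustrative names, not part of the original thread. A count of zero would mean that collgrad never changes within a panel.

```stata
webuse nlswork, clear
xtset idcode year

* The "within" row of xtsum reports the within-panel variation of collgrad
xtsum collgrad

* Flag panels in which collgrad takes more than one value
* (cg_min and cg_max are hypothetical helper names)
bysort idcode: egen byte cg_min = min(collgrad)
bysort idcode: egen byte cg_max = max(collgrad)
count if cg_min != cg_max
```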

    Respecify the model without a main effect for collgrad and things work
    out in a more numerically stable way:

    Code:
    . xtreg ln_wage hours c.hours#i.collgrad ,fe
    
    Fixed-effects (within) regression               Number of obs      =     28467
    Group variable: idcode                          Number of groups   =      4710
    
    R-sq:  within  = 0.0011                         Obs per group: min =         1
           between = 0.1868                                        avg =       6.0
           overall = 0.1034                                        max =        15
    
                                                    F(2,23755)         =     13.05
    corr(u_i, Xb)  = -0.4978                        Prob > F           =    0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           hours |   .0010481   .0002818     3.72   0.000     .0004957    .0016004
                 |
        collgrad#|
         c.hours |
              1  |  -.0030347   .0006335    -4.79   0.000    -.0042764   -.0017931
                 |
           _cons |    1.65641   .0094353   175.55   0.000     1.637916    1.674904
    -------------+----------------------------------------------------------------
         sigma_u |  .44574269
         sigma_e |  .32025546
             rho |  .65954017   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0:     F(4709, 23755) =     6.91         Prob > F = 0.0000
    
    . margins collgrad ,at(hours = (20(10)60))
    
    Adjusted predictions                              Number of obs   =      28467
    Model VCE    : Conventional
    
    Expression   : Linear prediction, predict()
    
    1._at        : hours           =          20
    
    2._at        : hours           =          30
    
    3._at        : hours           =          40
    
    4._at        : hours           =          50
    
    5._at        : hours           =          60
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    _at#collgrad |
            1 0  |   1.677371   .0049988   335.55   0.000     1.667574    1.687169
            1 1  |   1.616676   .0116339   138.96   0.000     1.593874    1.639478
            2 0  |   1.687852   .0040983   411.85   0.000     1.679819    1.695884
            2 1  |   1.596809   .0159708    99.98   0.000     1.565507    1.628111
            3 0  |   1.698332   .0049485   343.20   0.000     1.688633    1.708031
            3 1  |   1.576942    .020956    75.25   0.000     1.535869    1.618015
            4 0  |   1.708813   .0069327   246.48   0.000     1.695225    1.722401
            4 1  |   1.557075   .0262224    59.38   0.000      1.50568     1.60847
            5 0  |   1.719293   .0093553   183.78   0.000     1.700957    1.737629
            5 1  |   1.537208   .0316298    48.60   0.000     1.475215    1.599202
    ------------------------------------------------------------------------------
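
Since the original goal was to inspect the interaction graphically, these predictions can be passed to marginsplot (available since Stata 12). This is a minimal sketch using the same data and model as above. One caveat: the fixed effects absorb any level difference between graduates and non-graduates, so the vertical gap between the two lines is not identified by the model. The difference in slopes is what the interaction term actually estimates.

```stata
webuse nlswork, clear
xtset idcode year

* Main effect of hours only; collgrad enters through the interaction
xtreg ln_wage hours c.hours#i.collgrad, fe

* Adjusted predictions for both groups over a grid of hours
margins collgrad, at(hours = (20(10)60))

* One line per level of collgrad, plotted against hours
marginsplot
```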



    • #3
      Interesting. I just automatically include all main effects with any interaction terms -- but with a fixed effects model you have to resist the urge to do that with time-invariant variables.
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      Stata Version: 17.0 MP (2 processor)

      WWW: https://www3.nd.edu/~rwilliam



      • #4
        Jeff, thank you very much for your answer. This is essentially what I had in mind when asking about a purely 'technical' problem. However, the simple solution of using the single # operator never occurred to me.

        I was under the impression that using the double ## results in the same model. Stata does a great job after all. It creates all the lower-order terms, correctly recognizes that one of them does not vary within panel units, and excludes it from the model. I therefore have one last question. When you say leaving out the lower-order term by hand makes the approach 'numerically stable', does this imply the possibility that estimation results might differ depending on who excludes collinear variables: Stata or me?

        Thanks again.
        Daniel
        Last edited by daniel klein; 13 Aug 2014, 01:23. Reason: layout and typos



        • #5
          In the current case it should not matter for the estimation results whether Stata omits time-invariant regressors automatically or you omit them manually. There are other cases where the results might differ:

          Consider the case where you specify a full set of dummy variables in addition to the constant. Stata will drop one of the dummies, but you have no control over which one it chooses. Depending on the purpose of the estimation, it might be better in this case to drop one of the dummies manually to obtain "stable" results.

          If that is not what Jeff has in mind, I would be interested in further insights, too.
          https://twitter.com/Kripfganz



          • #6
            Sebastian,

            I see your point that it might be desirable to decide which of the collinear variables to exclude for substantive reasons.

            This is, however, not what I meant. In the case of indicator variables representing one categorical variable, the models are essentially the same (i.e. equivalent) in terms of computation, no matter which indicator we choose to exclude. I was rather wondering whether Stata would estimate larger standard errors or, worse, different point estimates for the same predictors in the same model. I cannot imagine this could possibly be the case, but it is always better to ask.

            Best
            Daniel
            Last edited by daniel klein; 13 Aug 2014, 02:06.



            • #7
              Daniel asked: "When you say leaving out the lower order term by hand makes the approach 'numerically stable', does this imply the possibility that estimation results might differ depending who excludes collinear variables: Stata or me?"

              Not at all. The estimation is stable.

              I meant that computation of the H matrix would be more numerically stable.
              This H matrix is documented in the 'Estimable function' subsection of the
              'Methods and formulas' for margins. margins uses this matrix to determine
              which of its margin calculations are estimable. H can be sensitive to collinear
              factor variables when not all the information is present in e(b), as in the case
              with xtreg, fe where the fixed effect is not explicitly part of the estimated
              parameters.
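
The claim that the estimates themselves do not depend on who drops the collinear term can be checked directly. The following is a minimal sketch; the stored-estimate names auto_drop and manual_drop are illustrative. Per the output posted in #1 and #2, the two columns should agree.

```stata
webuse nlswork, clear
xtset idcode year

* Specification 1: Stata drops the collinear main effect itself
quietly xtreg ln_wage c.hours##i.collgrad, fe
estimates store auto_drop

* Specification 2: the main effect of collgrad is left out by hand
quietly xtreg ln_wage hours c.hours#i.collgrad, fe
estimates store manual_drop

* Coefficients and standard errors side by side
estimates table auto_drop manual_drop, b(%10.7f) se(%10.7f)
```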



              • #8
                Thanks for clarifying, Jeff. I will have a look at the cited Methods and formulas section. Best, Daniel



                • #9
                  Dear all,

                  can I bring back this discussion and ask the same question with i.hours instead of c.hours in the model?

                  Code:
                  webuse nlswork, clear
                  xtset idcode year
                  keep if hours<50
                  xtreg ln_wage i.hours#i.collgrad, fe
                  margins, dydx(hours) at(collgrad=(0 1))


                  Is there a trick for estimating the marginal effects? Additionally, I would like to have hours=10 as the reference level. Is this possible?
                  Thanks a lot in advance!
                  Felizia



                  • #10
                    In reply to Felizia Hanemann's post (#9):

                    As this thread is referred to in several other threads, I would like to follow up on Felizia's question. Is there a way to estimate margins after a fixed-effects regression with interacted categorical variables (one of which is time-invariant)?
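
On the reference level: factor-variable notation lets the base be set directly, so ib10.hours makes hours = 10 the base category, provided that level occurs in the estimation sample. The sketch below combines this with the advice from #2 (keep the main effect of the time-varying variable, drop that of the time-invariant one). It is an unconfirmed suggestion rather than a verified solution: with one indicator per value of hours, some hours-by-collgrad cells may be empty, and margins may still report some effects as not estimable.

```stata
webuse nlswork, clear
xtset idcode year
keep if hours < 50

* ib10. sets hours = 10 as the base level; collgrad has no main effect
* because it is constant within idcode and absorbed by the fixed effects
xtreg ln_wage ib10.hours ib10.hours#i.collgrad, fe

* Effects of each hours level relative to the base, by collgrad
margins, dydx(hours) at(collgrad = (0 1))
```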



                    • #11
                      Dear all,

                      I also need to bring back that old discussion once again. I'm interested in plotting how my outcome changes over 12 months separately for different treatment intensities. When I initially used the fixed effect regression

                      Code:
                      xtreg outcome c.T##i.date, fe robust
                      margins date, at(T=(-1,0,1))
                      marginsplot
                      no margins could be calculated, and while searching for the reason I found this thread. I now understand that my regression must not include the main effect of any variable that is perfectly collinear with the fixed effects, as Stata does not seem to take care of collinear terms automatically in this case.
                      So I adjusted my code to

                      Code:
                      xtreg outcome c.T#i.date i.date, fe robust
                      margins date, at(T=(-1,0,1))
                      marginsplot
                      as treatment intensity T does not vary within my cross-sectional units. However, contrary to what I expected after reading this thread, I still do not get margins, regardless of whether I omit both, either, or neither of the main effects.

                      Can anyone see what else could be the reason that the margins still cannot be estimated?

                      Thanks so much, PM


