  • margins with predict(xbu) to take account of fixed effect

    I would like to generate a table of margins in order to visualize a panel data regression model estimated via xtreg, fe. However, the margins command does not allow me to use the option predict(xbu), which would be needed to incorporate the fixed effect into the margins. Here is a (simplified) example:
    Code:
    wbopendata, language(en - English) country() topics() indicator(ST.INT.ARVL; SE.ADT.LITR.ZS) long clear
    encode countryname, generate(ctry)
    xtset ctry year
    xtreg se_adt_litr_zs st_int_arvl year, fe cl(ctry)
    predict se_adt_litr_zs_hat if e(sample)
    predict se_adt_litr_zs_hat_fe if e(sample), xbu
    margins, at(st_int_arvl=(1000000(1000000)10000000)) // This works (ignore the warning)
    margins, at(st_int_arvl=(1000000(1000000)10000000)) predict(xbu) // This is needed
    Obviously, the predicted values se_adt_litr_zs_hat differ from se_adt_litr_zs_hat_fe, with the latter taking the fixed effect into account. Is there an alternative way to estimate margins at specified values of the independent variable, one that includes the fixed effect?

    Best regards,
    Sebastian van Baal

  • #2
    I don't know, but it is an ongoing source of irritation to me that, unlike the help for predict, the help for margins is not customized for each estimation command. I figure you just have to read the help for predict to see what margins can do, but in this case that apparently does not work: xbu is fine as a predict option after xtreg, but margins after xtreg still does not accept it. There may be very good reasons for this, but, as far as I can tell, they aren't explained anywhere.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    WWW: https://www3.nd.edu/~rwilliam


    • #3
      With this code, I can reproduce margins estimates using xb via predict commands:

      Code:
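      webuse nlswork, clear
      * work on a copy of hours so the original variable is left untouched
      clonevar xhours = hours
      xtreg ln_wage xhours age , fe
      * keep only the estimation sample
      drop if !e(sample)
      margins, at(xhours = (0 (40) 160) )
      * reproduce each margin by hand: set xhours to a value, then predict xb
      replace xhours = 0
      predict p00, xb
      replace xhours = 40
      predict p40, xb
      replace xhours = 80
      predict p80, xb
      replace xhours = 120
      predict p120, xb
      replace xhours = 160
      predict p160, xb
      * the means of these predictions should match the margins above
      sum p00 p40 p80 p120 p160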
      Key Output:

      Code:
      . margins, at(xhours = (0 (40) 160) )
      
      Predictive margins                                Number of obs   =      28443
      Model VCE    : Conventional
      
      Expression   : Linear prediction, predict()
      
      1._at        : xhours          =           0
      2._at        : xhours          =          40
      3._at        : xhours          =          80
      4._at        : xhours          =         120
      5._at        : xhours          =         160
      
      ------------------------------------------------------------------------------
                   |            Delta-method
                   |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               _at |
                1  |   1.648488   .0089328   184.54   0.000      1.63098    1.665996
                2  |   1.677863   .0019792   847.75   0.000     1.673984    1.681743
                3  |   1.707239    .010552   161.79   0.000     1.686557    1.727921
                4  |   1.736615   .0200519    86.61   0.000     1.697314    1.775916
                5  |    1.76599   .0295993    59.66   0.000     1.707977    1.824004
      ------------------------------------------------------------------------------
      
      
      . sum p00 p40 p80 p120 p160
      
          Variable |       Obs        Mean    Std. Dev.       Min        Max
      -------------+--------------------------------------------------------
               p00 |     28443    1.648488    .1219322   1.374862   1.957044
               p40 |     28443    1.677863    .1219322   1.404237    1.98642
               p80 |     28443    1.707239    .1219322   1.433613   2.015796
              p120 |     28443    1.736615    .1219322   1.462989   2.045171
              p160 |     28443     1.76599    .1219322   1.492364   2.074547
      BUT, when I switch the predict option to xbu,

      Code:
      webuse nlswork, clear
      clonevar xhours = hours
      xtreg ln_wage xhours age , fe
      drop if !e(sample)
      * Margins won't work with xbu option
      * margins, at(xhours = (0 (40) 160) ) predict(xbu)
      replace xhours = 0
      predict p00, xbu
      replace xhours = 40
      predict p40, xbu
      replace xhours = 80
      predict p80, xbu
      replace xhours = 120
      predict p120, xbu
      replace xhours = 160
      predict p160, xbu
      sum p00 p40 p80 p120 p160
      
      
      . sum p00 p40 p80 p120 p160
      
          Variable |       Obs        Mean    Std. Dev.       Min        Max
      -------------+--------------------------------------------------------
               p00 |     28443    1.675335    .3892708          0   3.930026
               p40 |     28443    1.675335    .3892708          0   3.930026
               p80 |     28443    1.675335    .3892708          0   3.930026
              p120 |     28443    1.675335    .3892708          0   3.930026
              p160 |     28443    1.675335    .3892708          0   3.930026
      In other words, with xbu, no matter what value you plug in for hours, the prediction is exactly the same. I don't really understand why that is, but it does make clear why xbu isn't a legit option for margins after xtreg.
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 19.5 MP (2 processor)

      WWW: https://www3.nd.edu/~rwilliam


      • #4
        Richard: I played around a bit with your example. The behavior of predict, xbu is totally weird when an independent variable is changed in a counterfactual way. The results are not as one would expect. I have no explanation for it.
        https://www.kripfganz.de/stata/


        • #5
          Here is what I think, after a little research. Starting from the remark in #4,

          "The results are not as one would expect."

          I asked myself: What would one expect? To answer this question, we need to think about what the \(u_i\) are. Well, these are the residuals for a given panel unit, right? The residuals depend, by the very definition of a residual, on the predicted values (and, maybe even more importantly, on the observed values; see the next post). We define

          \[
          u_i = \bar{y_i} - \bar{xb_i}
          \]

          This means that the \(u_i\) account for the (mean) deviation of each panel unit from its own (mean) predicted value, regardless of what that predicted value might be. The (predicted) \(u_i\) will change with the values we plug in for \(x\), making sure that \(\bar{xb_i} + u_i = xbu_i\) is constant across different \(x\) values. You can try that using Richard's code and predict the \(u_i\) for each level of \(x\).
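
          A minimal sketch of that experiment, assuming the same nlswork setup as in Richard's post #3. The variable names u0, u40, u80 and xbu0, xbu40, xbu80 are made up for illustration, and only three levels of xhours are used to keep it short:

          ```stata
          webuse nlswork, clear
          clonevar xhours = hours
          xtreg ln_wage xhours age, fe
          drop if !e(sample)

          * for each counterfactual level of xhours, store u_i and xb + u_i
          foreach h of numlist 0 40 80 {
              quietly replace xhours = `h'
              predict double u`h', u
              predict double xbu`h', xbu
          }

          * restore the observed values
          quietly replace xhours = hours

          * compare the predicted u_i and the xbu predictions across levels
          summarize u0 u40 u80
          summarize xbu0 xbu40 xbu80
          ```

          If the reasoning above is right, the summaries of the u variables should differ across levels, while those of the xbu variables should not.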

          There is another component in the error term that I have omitted from the first equation, namely \(\epsilon_{it}\). We should expect the observed values \(y_{it}\) to equal the predicted values \(xb_{it} + u_i + \epsilon_{it}\). After all, this is exactly the model we have estimated. Now, given that \(\epsilon_{it}\) is assumed to have mean 0, we should also expect the mean of the observed values \(y_{it}\) to be equal to the mean of the predicted values, once we account for the unit effects in that prediction.

          I hope this sheds some light on this issue.

          Best
          Daniel
          Last edited by daniel klein; 15 Aug 2014, 04:58.


          • #6
            Hm, on second thought let's state the above the other way round.

            The thing to stress here is not so much the dependence of \(u_i\) on \(xb_{it}\), but their dependence on the actually observed values \(y_{it}\). This is probably what makes Sebastian wonder about predicted values for counterfactual \(x\) values. There is no way to get predicted residuals that are independent of what we have observed (i.e. counterfactual residuals), because residuals are defined as the difference between what we have observed and what we predict.

            Best
            Daniel


            • #7
              Daniel: You hit the nail on the head. I was incorrectly expecting that the prediction of the fixed effects \(u_i\) would remain the same, based on the original model, even if we afterwards change the values of some regressors. But you are perfectly right. predict first computes the xb prediction which, of course, depends on the changes made to \(x_{it}\) in the meantime. The predictions of the error components are then based on this prediction of \(x_{it}b\), as you explained.

              It is then also clear why the xbu predictions are always the same in Richard's example: he replaces xhours with a constant that does not vary over time. The difference in the mean when xhours equals 40 or 80 for all observations, for example, is then simply absorbed by a correspondingly changed prediction of \(u_i\). Consequently, the overall prediction of \(x_{it}b + u_i\) remains unchanged.
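
              A short derivation may make this concrete. It is only a sketch, assuming that predict recomputes \(u_i\) from the data currently in memory as \(\bar{y}_i - \overline{xb}_i\) (as Daniel describes in #5). The symbols \(a\) (the constant), \(b_1\) and \(b_2\) (the coefficients on xhours and age), and \(c\) (the value plugged in for xhours) are my own notation:

              \[
              \begin{aligned}
              xb_{it} &= a + c\,b_1 + \mathit{age}_{it}\,b_2 \\
              u_i &= \bar{y}_i - \overline{xb}_i = \bar{y}_i - a - c\,b_1 - \overline{\mathit{age}}_i\,b_2 \\
              xb_{it} + u_i &= \bar{y}_i + \left(\mathit{age}_{it} - \overline{\mathit{age}}_i\right) b_2
              \end{aligned}
              \]

              The plugged-in value \(c\) cancels in the last line, so under this assumption the xbu prediction is the same for every \(c\). That would be consistent with the five identical rows in Richard's summary table.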
              Last edited by Sebastian Kripfganz; 15 Aug 2014, 06:03.
              https://www.kripfganz.de/stata/


              • #8
                Thanks for correcting my sloppy notation. Of course the \(b\) vary neither over \(i\) nor over \(t\), as my original post might incorrectly suggest.

                Best
                Daniel


                • #9
                  This is pure improvisation, but does this do anything useful (and more importantly, valid)? Basically it holds the ui constant rather than let it change as we plug in different values for x. Since the mean of u is 0 it produces the same mean values as the original xb predictions, but there is more variability across cases. Somebody who really understands this stuff may shudder, but I thought I would toss it out.

                  Code:
                  webuse nlswork, clear
                  clonevar xhours = hours
                  xtreg ln_wage xhours age , fe
                  * store the predictions below in double precision
                  set type double
                  drop if !e(sample)
                  * predict u_i once, from the data as estimated, and hold it fixed
                  predict u, u
                  * for each level of xhours: predict xb, then add the fixed u_i
                  replace xhours = 0
                  predict p00, xb
                  replace p00 = p00 + u
                  replace xhours = 40
                  predict p40, xb
                  replace p40 = p40 + u
                  replace xhours = 80
                  predict p80, xb
                  replace p80 = p80 + u
                  replace xhours = 120
                  predict p120, xb
                  replace p120 = p120 + u
                  replace xhours = 160
                  predict p160, xb
                  replace p160 = p160 + u
                  sum p00 p40 p80 p120 p160 u
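
                  The same steps can also be written as a loop. This is only a compact sketch of the code above, not a different method. The variables are named p0, p40, ..., p160 here (so p0 rather than p00), and the final replace, which restores the observed hours, is an addition of mine:

                  ```stata
                  webuse nlswork, clear
                  clonevar xhours = hours
                  xtreg ln_wage xhours age, fe
                  drop if !e(sample)

                  * u_i predicted once from the observed data and then held fixed
                  predict double u, u

                  foreach h of numlist 0(40)160 {
                      quietly replace xhours = `h'
                      predict double p`h', xb
                      quietly replace p`h' = p`h' + u
                  }

                  * put the observed values back
                  quietly replace xhours = hours

                  summarize p0 p40 p80 p120 p160 u
                  ```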
                  -------------------------------------------
                  Richard Williams, Notre Dame Dept of Sociology
                  StataNow Version: 19.5 MP (2 processor)

                  WWW: https://www3.nd.edu/~rwilliam


                  • #10
                    Richard: I think your code is the correct way to achieve the purpose of the opening post.
                    https://www.kripfganz.de/stata/


                    • #11
                      Richard's code might be what Sebastian van Baal originally had in mind. Richard's results could (in principle) be replicated using margins after fitting the equivalent model with the LSDV estimator. To verify this, we need a smaller sample to avoid matsize limits, though. Here is Richard's code with a few little extras:

                      Code:
                      webuse nlswork, clear
                      
                      // we only keep 20 panel-units
                      keep if inrange(idcode, 1, 20)
                      
                      clonevar xhours = hours
                      xtreg ln_wage xhours age , fe
                      set type double
                      drop if !e(sample)
                      predict u, u
                      replace xhours = 0
                      predict p00, xb
                      replace p00 = p00 + u
                      replace xhours = 40
                      predict p40, xb
                      replace p40 = p40 + u
                      replace xhours = 80
                      predict p80, xb
                      replace p80 = p80 + u
                      replace xhours = 120
                      predict p120, xb
                      replace p120 = p120 + u
                      replace xhours = 160
                      predict p160, xb
                      replace p160 = p160 + u
                      sum p00 p40 p80 p120 p160 u
                      
                      // now we run the equivalent LSDV regression
                      qui reg ln_wage hours age i.idcode
                      
                      // and calculate the desired predicted values
                      margins ,at(hours = (0(40)160)) asobserved
                      This might clarify what Richard is doing here.

                      Best
                      Daniel
                      Last edited by daniel klein; 15 Aug 2014, 07:34. Reason: removed some additional stuff from the code, as it was not needed for the example


                      • #12
                        Note that my revised code, the original code with xb only, and Daniel's LSDV model all produce the same estimates for the marginal effects. (The revised code does add more variability to the individual predictions.) So, if all you want are the marginal effects, I am not sure what the various revised versions of the code accomplish; but if you wanted to do something with the predicted values for individual cases, maybe they would be helpful.

                        Tweaking Daniel's tweak of my tweak, you can see that the marginal effects for xtreg are the same as the marginal effects for LSDV:

                        Code:
                        webuse nlswork, clear
                        
                        // we only keep 20 panel-units
                        keep if inrange(idcode, 1, 20)
                        
                        clonevar xhours = hours
                        xtreg ln_wage xhours age , fe
                        set type double
                        drop if !e(sample)
                        predict u, u
                        replace xhours = 0
                        predict p00, xb
                        replace p00 = p00 + u
                        replace xhours = 40
                        predict p40, xb
                        replace p40 = p40 + u
                        replace xhours = 80
                        predict p80, xb
                        replace p80 = p80 + u
                        replace xhours = 120
                        predict p120, xb
                        replace p120 = p120 + u
                        replace xhours = 160
                        predict p160, xb
                        replace p160 = p160 + u
                        sum p00 p40 p80 p120 p160 u
                        replace xhours = hours
                        margins ,at(xhours = (0(40)160)) asobserved
                        
                        // now we run the equivalent LSDV regression
                        qui reg ln_wage hours age i.idcode
                        
                        // and calculate the desired predicted values
                        margins ,at(hours = (0(40)160)) asobserved
                        Here are the margins after xtreg, followed by the margins after LSDV:

                        Code:
                        * xtreg margins
                        ------------------------------------------------------------------------------
                                     |            Delta-method
                                     |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                                 _at |
                                  1  |   1.868863   .1072054    17.43   0.000     1.658744    2.078982
                                  2  |   1.924977   .0229909    83.73   0.000     1.879916    1.970039
                                  3  |   1.981091   .1276181    15.52   0.000     1.730965    2.231218
                                  4  |   2.037206   .2424178     8.40   0.000     1.562076    2.512336
                                  5  |    2.09332   .3577269     5.85   0.000     1.392188    2.794452
                        ------------------------------------------------------------------------------
                        
                        * LSDV margins
                        ------------------------------------------------------------------------------
                                     |            Delta-method
                                     |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                                 _at |
                                  1  |   1.868863   .1072054    17.43   0.000     1.657112    2.080614
                                  2  |   1.924977   .0229909    83.73   0.000     1.879566    1.970389
                                  3  |   1.981091   .1276181    15.52   0.000     1.729022    2.233161
                                  4  |   2.037206   .2424178     8.40   0.000     1.558385    2.516027
                                  5  |    2.09332   .3577269     5.85   0.000     1.386742    2.799898
                        ------------------------------------------------------------------------------
                        In short, while Sebastian's initial question seemed very reasonable to me, I am now thinking it doesn't gain anything to add in the fixed effect, because just using margins with xb will yield the same estimates of the marginal effects. That is, the fixed effect has a mean of 0, so including it doesn't change any of the results from margins.
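
                        A quick way to check the claim that the fixed effect has a mean of 0 might look like this. It is only a sketch on the full nlswork data; the variable name u is arbitrary:

                        ```stata
                        webuse nlswork, clear
                        xtreg ln_wage hours age, fe

                        * predicted fixed effects for the estimation sample
                        predict double u if e(sample), u

                        * the mean over the observations in the sample should be
                        * zero up to rounding error
                        summarize u
                        ```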

                        Getting back to my first post, I wish the margins help was customized for each estimation command, and/or that there was a FAQ that explained why some prediction options aren't allowed. Super smart people might understand why some options are legit and others are not, but alas, I am not one of those people.
                        -------------------------------------------
                        Richard Williams, Notre Dame Dept of Sociology
                        StataNow Version: 19.5 MP (2 processor)

                        WWW: https://www3.nd.edu/~rwilliam


                        • #13
                          The essential point here is that we are considering a linear model without interaction effects or other kinds of nonlinearities. To be clear about the terminology, we are talking about (counterfactual) predictions. In a fixed-effects setup, I would generally use predict with the option xbu instead of just xb, because it is the nature of this model that these effects are "fixed" for each unit (so why ignore them?), but this might depend on the particular purpose of the predictions. For counterfactual predictions, I would then follow your code above and keep the estimate of \(u_i\) fixed from the initial regression. In any case, I find it misleading to talk about marginal effects here. In my understanding, marginal effects are just the derivatives with respect to the regressors. In the linear model, they do not depend on the values of the regressors; they simply equal the coefficients \(b\).
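
                          To illustrate the last point, here is a small sketch using the same nlswork model as above. In this linear specification the derivatives reported by margins, dydx() should coincide with the estimated coefficients:

                          ```stata
                          webuse nlswork, clear
                          clonevar xhours = hours
                          xtreg ln_wage xhours age, fe

                          * marginal effects in the sense of derivatives of the linear prediction
                          margins, dydx(xhours age)

                          * the coefficients themselves, for comparison
                          display _b[xhours]
                          display _b[age]
                          ```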
                          Last edited by Sebastian Kripfganz; 15 Aug 2014, 08:27.
                          https://www.kripfganz.de/stata/


                          • #14
                            Even though this might be an unnecessary post, let me thank you for the illuminating discussion. I'll admit freely that I am not sure anymore as to whether I really want what I wanted (counterfactual predictions). If I still do, valuable (and valid) options are on the table, and by now I have a somewhat better idea as to why Stata considers xbu inappropriate for margins.
