  • random effect vs fixed effect

    Dear All, I have been confused by the following for a long time. Consider the data and code
    Code:
    webuse grunfeld, clear
    
    xtset company year
    
    // (1) RE
    xtreg invest mvalue kstock i.year, re robust
    // (2) FE
    xtreg invest mvalue kstock i.year, fe robust
    // (3)
    xtreg invest mvalue kstock i.company i.year, re robust
    It is clear that regression (1) is the RE estimator and (2) is the FE estimator. However, I often see people running (3), and I wonder whether this is correct and what theory or assumptions lie behind it. I notice that the estimates of the key variables are the same for (2) and (3), but their standard errors differ.
    Ho-Chuan (River) Huang
    Stata 19.0, MP(4)

  • #2
    River:
    a very interesting topic that I have never challenged myself with.
    Admittedly, I'm more familiar with N>T panel datasets; that said, it seems that (as expected) the third command removes the panel-wise effect, as the sigma_u statistic is 0 (the between R-sq is also 1.000).
    On a different note, since -grunfeld- is a T>N panel dataset, I wonder whether -xtreg- is actually the way to go instead of -xtregar- or -xtgls-.
    It is also interesting to notice that running similar code on an N>T panel dataset keeps Stata humming forever:
    Code:
    . use "https://www.stata-press.com/data/r16/nlswork.dta"
    (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
    . xtreg ln_wage i.year i.idcode age if idcode<=2
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Hi River Huang

      the third model allows you to estimate a panel in which the cross-section heterogeneity is captured by random shocks, while the time heterogeneity is captured by fixed effects. I do not think xtreg can be used to estimate a two-way random-effects model. I think that other software, such as EViews, allows you to include both cross-section and period random shocks. I googled a little and found the following post, which suggests using the xtmixed command (or its newer replacement, mixed). I hope it helps.

      • #4
        Dario is correct. Part of the confusion arises from the fact that you can add additional fixed effects in xtreg, fe using dummies. This is not the case for the random effects (error components) model, i.e., you cannot add additional random effects using dummies. The best way to understand what xtreg, re is doing if one adds dummies is to view the syntax for mixed, where xtreg, re can be considered a special case (two-level model).


        mixed depvar fe_equation [|| re_equation] [|| re_equation ...] [, options]

        where the syntax of fe_equation is

        [indepvars] [if] [in] [weight] [, fe_options]

        and the syntax of re_equation is one of the following:

        for random coefficients and intercepts

        levelvar: [varlist] [, re_options]
        So, you have a fixed effects equation and a random effects equation. In this sense, therefore, you are able to estimate a fixed effects equation using random effects estimators, an example being #1 here and #5 in the following: https://www.statalist.org/forums/for...ce-using-mixed

        Quoting #3: "I do not think xtreg can be used to estimate a two-ways random effect model."
        mixed can do that. Here is an example

        Code:
        webuse grunfeld, clear
        *2WFE (company and time)
        xtset company year
        xtreg invest mvalue kstock i.year, fe
        *2WRE (company and time)
        mixed invest mvalue kstock || _all: R.company || _all: R.year, mle

        • #5
          Start from the beginning and consider: y(i, t) = b0 + b1.x(i, t) + e(i, t), where e is the residual or error term, i refers to the individual or group (in Stata parlance), and t to another dimension, mostly time.
          Now introduce a random variable, u(i), to capture unobserved individual heterogeneity which does not change over t. The model becomes y(i, t) = b0 + b1.x(i, t) + u(i) + e(i, t)
          If we omit u we have an omitted variable bias problem. So u is included in the error term: v(i, t) = u(i) + e(i, t)
          Two cases arise:
          (i) u is correlated with x, which leads to an endogeneity problem. So we have to find a way to "eliminate" it. This we do with the FE or within transformation
          (ii) u is not correlated with x, so we leave it in the error term. Now the error term has a specific structure. Under certain assumptions, this gives us the RE model
          The assumptions leading to the RE model are very restrictive.
          To allow for misspecification of the variance of the error term we robustify.

          An alternative to the FE transformation is to introduce individual dummy (indicator) variables into the estimated equation. This is known as the Least Squares Dummy Variables (LSDV) model. For reasons too long to explain here, one should not interpret the coefficient attached to these dummy (indicator) variables.
          If the number of individuals is large, the LSDV model is impractical.
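          As a minimal sketch of the two routes just described, assuming the grunfeld data from #1 (the mean_* and dm_* variable names are made up for illustration), the within transformation can be done by hand and compared with LSDV and with -xtreg, fe-. The slope coefficients should agree across all three; the standard errors of the hand-demeaned regression will not, because -regress- does not account for the estimated company means in its degrees of freedom.
          Code:
          webuse grunfeld, clear
          xtset company year
          * within transformation by hand: subtract each company's mean
          foreach v of varlist invest mvalue kstock {
              bysort company: egen double mean_`v' = mean(`v')
              generate double dm_`v' = `v' - mean_`v'
          }
          regress dm_invest dm_mvalue dm_kstock, noconstant
          * LSDV: one indicator per company
          regress invest mvalue kstock i.company
          * built-in FE (within) estimator
          xtreg invest mvalue kstock, fe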

          The use of the terms FE and RE is unfortunate but has historical origins and is too entrenched to attempt to change it now.

          Just like with individual heterogeneity, we can introduce aggregate temporal effects which are unobservable, which vary over time but are the same for all individuals. The only way to introduce aggregate temporal effects in Stata is by means of time dummies (indicators).

          On edit, I just saw Andrew's reference to the -mixed- command. I have not looked at it as yet.

          • #6
            Andrew Musau indeed. What I meant is that you cannot use xtreg to run the estimation that River is asking for. But mixed can do it

            • #7
              Command (3) in the first post by River doesn't make sense. That was my point in #5 above, but I probably made it too indirectly.
              If one looks carefully at the output of (3), one observes that
              Code:
                   sigma_u |          0
                   sigma_e |  51.724523
                       rho |          0   (fraction of variance due to u_i)
              The output is identical to:
              Code:
              reg invest mvalue kstock i.company i.year, cluster(company)
              Compare the Root MSE of the reg command with sigma_e of the re command
              This is why the coefficient estimates are the same as the FE model.
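              A quick way to put those numbers side by side, assuming the grunfeld data and the commands from #1 (a sketch only; it relies on the stored results e(sigma_u) and e(sigma_e) from -xtreg, re- and e(rmse) from -regress-):
              Code:
              webuse grunfeld, clear
              xtset company year
              * RE with a full set of company dummies, as in (3)
              quietly xtreg invest mvalue kstock i.company i.year, re robust
              display "sigma_u = " e(sigma_u) "   sigma_e = " e(sigma_e)
              * pooled OLS with the same dummies
              quietly regress invest mvalue kstock i.company i.year, cluster(company)
              display "Root MSE = " e(rmse)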
              Last edited by Eric de Souza; 07 Apr 2021, 06:56.

              • #8
                Eric de Souza, yes - in this case the variance of the individual effect is 0. However, the estimated model becomes a two-way fixed effects model because both fixed effects are now incorporated using dummies.

                • #9
                  Dear @Carlo Lazzaro, @Dario Maimone Ansaldo Patti, @Andrew Musau, and @Eric de Souza: Thank you all for your helpful suggestions. I need time to digest them.
                  Ho-Chuan (River) Huang
                  Stata 19.0, MP(4)

                  • #10
                    The answer is simple, and Eric de Souza gave it in #7, although I disagree with the way Eric phrases it.

                    It is not that Model 3 does not make sense. It is that Model 2 and Model 3 and the model that Eric showed in #7
                    Code:
                     
                     reg invest mvalue kstock i.company i.year, cluster(company)
                    are all algebraically equivalent. They are all the "fixed effects" model.

                    And they are all the fixed effects model because it does not matter whether in

                    Yit = b*Xit [+ Ui ] + Eit

                    we allow for the unit-level random effect + Ui or omit it, as long as Xit includes a full set of dummy variables for the units. Including the full set of unit-level dummies absorbs the random effect Ui, which shows up in what Eric reported in #7: Var(Ui) = 0 once the full set of dummies is included.

                    I believe that the standard errors are slightly different because of some degrees of freedom adjustment.
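                    To see both points side by side, a sketch along these lines could be used, assuming the grunfeld data from #1 and clustering on company in all three commands so that the variance estimators are comparable (fe, re_dummies, and lsdv are arbitrary labels for the stored estimates):
                    Code:
                    webuse grunfeld, clear
                    xtset company year
                    * Model 2: FE with year dummies
                    quietly xtreg invest mvalue kstock i.year, fe vce(cluster company)
                    estimates store fe
                    * Model 3: RE with company and year dummies
                    quietly xtreg invest mvalue kstock i.company i.year, re vce(cluster company)
                    estimates store re_dummies
                    * Pooled OLS with company and year dummies, as in #7
                    quietly regress invest mvalue kstock i.company i.year, vce(cluster company)
                    estimates store lsdv
                    * coefficients and standard errors for the two key regressors
                    estimates table fe re_dummies lsdv, keep(mvalue kstock) b(%9.4f) se(%9.4f)
                    The coefficients on mvalue and kstock should match across the three columns; any remaining gap in the standard errors would then point to differences in the finite-sample adjustments that each command applies.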

                    • #11
                      @ Joro: when I said that model 3 does not make sense, I meant that once one introduces the two-way indicator variables into the equation, the RE transformation (and this is my problem: I don't know how to complete the sentence !)
                      I was going to work through the algebra but then got involved in other matters.
