random effect vs fixed effect

River Huang

Join Date: Mar 2016

Posts: 1908
#1


06 Apr 2021, 17:02

Dear All, I have been confused by the following for a long time. Consider the following data and code:

Code:

webuse grunfeld, clear
xtset company year
// (1) RE
xtreg invest mvalue kstock i.year, re robust
// (2) FE
xtreg invest mvalue kstock i.year, fe robust
// (3)
xtreg invest mvalue kstock i.company i.year, re robust

It is clear that regression (1) is the RE estimator and (2) is the FE estimator. However, I often see people running (3), and I wonder whether it is correct and what theory or assumptions lie behind it. I notice that the coefficient estimates for the key variables are the same in (2) and (3), but their standard errors differ.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#2

07 Apr 2021, 01:25

River:
a very interesting topic that I've never challenged myself with.
Admittedly, I'm more familiar with N>T panel datasets; that said, it seems that (as expected) the third code removes the panel-wise effect, as the -u- statistic is 0 (the between R-sq is also 1.000).
On a different note, since -grunfeld- is a T>N panel dataset, I wonder whether -xtreg- is actually the way to go instead of -xtregar- or -xtgls-.
It is also interesting to note that similar code on an N>T panel dataset keeps Stata humming forever:

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
(National Longitudinal Survey. Young Women 14-26 years of age in 1968)
. xtreg ln_wage i.year i.idcode age if idcode<=2

Kind regards,
Carlo
(Stata 19.0)
Dario Maimone Ansaldo Patti

Join Date: Aug 2014

Posts: 505
#3

07 Apr 2021, 02:02

Hi River Huang

the third model allows you to estimate a panel where the cross-section heterogeneity is captured by random shocks, while time heterogeneity is captured by fixed effects. I do not think xtreg can be used to estimate a two-way random effects model. I think that other software, such as EViews, allows you to include both cross-section and period random shocks. I googled a little and found a post which suggests using the xtmixed command (or its newer version, mixed). I hope it helps.
Andrew Musau

Join Date: Oct 2014

Posts: 10228
#4

07 Apr 2021, 02:55

Dario is correct. Part of the confusion arises from the fact that you can add additional fixed effects in xtreg, fe using dummies. This is not the case for the random effects (error components) model, i.e., you cannot add additional random effects using dummies. The best way to understand what xtreg, re is doing if one adds dummies is to view the syntax for mixed, where xtreg, re can be considered a special case (two-level model).

mixed depvar fe_equation [|| re_equation] [|| re_equation ...] [, options]

where the syntax of fe_equation is

[indepvars] [if] [in] [weight] [, fe_options]

and the syntax of re_equation is one of the following:

for random coefficients and intercepts

levelvar: [varlist] [, re_options]

So, you have a fixed effects equation and a random effects equation. In this sense, therefore, you are able to estimate a fixed effects equation using random effects estimators, an example being #1 here and #5 in the following: https://www.statalist.org/forums/for...ce-using-mixed

Quoting #3: "I do not think xtreg can be used to estimate a two-ways random effect model."

mixed can do that. Here is an example

Code:

webuse grunfeld, clear
*2WFE (company and time)
xtset company year
xtreg invest mvalue kstock i.year, fe
*2WRE (company and time)
mixed invest mvalue kstock || _all: R.company || _all: R.year, mle
Eric de Souza

Join Date: Mar 2014

Posts: 587
#5

07 Apr 2021, 03:31

Start from the beginning and consider: y(i, t) = b0 + b1.x(i, t) + e(i, t), where e is the residual or error term, i refers to the individual or group (in Stata parlance) and t to another dimension, mostly time.
Now introduce a random variable, u(i), to capture unobserved individual heterogeneity which does not change over t. The model becomes y(i, t) = b0 + b1.x(i, t) + u(i) + e(i, t).
If we omit u we have an omitted variable bias problem. So u is included in the error term: v(i, t) = u(i) + e(i, t).
Two cases arise:
(i) u is correlated with x, which leads to an endogeneity problem. So we have to find a way to "eliminate" it. This we do with the FE or within transformation.
(ii) u is not correlated with x, so we leave it in the error term. Now the error term has a specific structure. Under certain assumptions, this gives us the RE model.
The assumptions leading to the RE model are very restrictive.
To allow for misspecification of the variance of the error term, we robustify.
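For reference, the two cases above can be written out in the usual textbook notation. This is only a sketch: the bars denote averages over t for a given i, and the variance symbols \sigma_u^2 and \sigma_e^2 are notation assumed here, not taken from the post. The error structure shown for case (ii) holds under the standard RE assumptions that u and e are homoskedastic, uncorrelated with each other, and that e is serially uncorrelated.

```latex
% Model with unobserved individual heterogeneity
y_{it} = b_0 + b_1 x_{it} + u_i + e_{it}, \qquad v_{it} = u_i + e_{it}

% Case (i): within (FE) transformation; u_i drops out because it does not vary over t
y_{it} - \bar{y}_i = b_1 \,(x_{it} - \bar{x}_i) + (e_{it} - \bar{e}_i)

% Case (ii): u_i stays in the error term, which then has the RE structure
\operatorname{Var}(v_{it}) = \sigma_u^2 + \sigma_e^2, \qquad
\operatorname{Cov}(v_{it}, v_{is}) = \sigma_u^2 \quad (t \neq s)
```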

An alternative to the FE transformation is to introduce individual dummy (indicator) variables into the estimated equation. This is known as the Least Squares Dummy Variables (LSDV) model. For reasons too long to explain here, one should not interpret the coefficients attached to these dummy (indicator) variables.
If the number of individuals is large, the LSDV model is impractical.
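A minimal sketch of the LSDV point, using the -grunfeld- data from #1. The scalar names b_fe and b_lsdv are made up for illustration. The -areg- line at the end is one way to absorb the indicators rather than estimate them explicitly, which may be more practical when the number of individuals is large.

```stata
* Sketch: LSDV and the within (FE) estimator give the same slope estimates
webuse grunfeld, clear
xtset company year

* Within (FE) estimator
xtreg invest mvalue kstock, fe
scalar b_fe = _b[mvalue]

* LSDV: pooled OLS with a full set of company indicators
regress invest mvalue kstock i.company
scalar b_lsdv = _b[mvalue]

* The two slopes should agree up to numerical precision
display "FE: " b_fe "   LSDV: " b_lsdv

* areg absorbs the company indicators instead of reporting them
areg invest mvalue kstock, absorb(company)
```

The slope on kstock can be compared in the same way.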

The use of the terms FE and RE is unfortunate but has historical origins and is too entrenched to attempt to change it now.

Just like with individual heterogeneity, we can introduce aggregate temporal effects which are unobservable, which vary over time but are the same for all individuals. The only way to introduce aggregate temporal effects in Stata is by means of time dummies (indicators).

On edit, I just saw Andrew's reference to the -mixed- command. I have not looked at it as yet.
Dario Maimone Ansaldo Patti

Join Date: Aug 2014

Posts: 505
#6

07 Apr 2021, 05:13

Andrew Musau: indeed. What I meant is that you cannot use xtreg to run the estimation that River is asking for. But mixed can do it.
Eric de Souza

Join Date: Mar 2014

Posts: 587
#7

07 Apr 2021, 06:53

Command (3) in the first post by River doesn't make sense. That was my point in #5 above, but I probably made it too indirectly.
If one looks carefully at the output of (3), one observes that

Code:

sigma_u |          0
sigma_e |  51.724523
    rho |          0   (fraction of variance due to u_i)

The output is identical to:

Code:

reg invest mvalue kstock i.company i.year, cluster(company)

Compare the Root MSE of the reg command with sigma_e of the re command.
This is why the coefficient estimates are the same as in the FE model.
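One way to see why sigma_u = 0 leads to the OLS output is the standard quasi-demeaning form of the RE (GLS) estimator for a balanced panel with T periods per unit. The notation (\theta, \sigma_u^2, \sigma_e^2) is the usual textbook one and is assumed here rather than taken from the thread; x_{it} stands for all regressors, including any indicators.

```latex
% RE estimation is OLS on quasi-demeaned data
y_{it} - \theta\,\bar{y}_i
  = (1-\theta)\, b_0 + b_1\,(x_{it} - \theta\,\bar{x}_i) + (v_{it} - \theta\,\bar{v}_i),
\qquad
\theta = 1 - \sqrt{\frac{\sigma_e^2}{T\,\sigma_u^2 + \sigma_e^2}}

% If the estimated variance of u_i is zero, no demeaning takes place
\sigma_u^2 = 0 \;\Longrightarrow\; \theta = 0
```

With \theta = 0 the transformed equation is just the original one, so the RE command amounts to pooled OLS on whatever regressors were supplied. Since those regressors include i.company and i.year, this is the LSDV regression.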

Last edited by Eric de Souza; 07 Apr 2021, 06:56.
Andrew Musau

Join Date: Oct 2014

Posts: 10228
#8

07 Apr 2021, 07:13

Eric de Souza, yes - in this case the variance of the individual effect is 0. However, the estimated model becomes a two-way fixed effects model because both fixed effects are now incorporated using dummies.
River Huang

Join Date: Mar 2016

Posts: 1908
#9

07 Apr 2021, 18:11

Dear @Carlo Lazzaro, @Dario Maimone Ansaldo Patti, @Andrew Musau, and @Eric de Souza: Thank you all for your helpful suggestions. I need time to digest them.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#10

08 Apr 2021, 00:25

The answer is simple, and Eric de Souza gave it in #7, although I disagree with the way Eric phrases it.

It is not that Model 3 does not make sense. It is that Model 2 and Model 3 and the model that Eric showed in #7

Code:

reg invest mvalue kstock i.company i.year, cluster(company)

are all algebraically equivalent. They are all the "fixed effects" model.

And they are all the fixed effects model because it does not matter whether in

Yit = b*Xit [+ Ui ] + Eit

we allow for the unit-level random effect Ui or omit it, as long as Xit includes a full set of dummy variables for the units. Including the full set of unit-level dummies absorbs the random effect Ui, which shows up in what Eric reported in #7: Var(Ui) = 0 when we include the full set of dummies.

I believe that the standard errors are slightly different because of some degrees of freedom adjustment.
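A rough way to inspect this is to store the three sets of results and line them up. The sketch below uses the same data and commands as earlier in the thread; the stored names fe2, re3 and ols are made up for illustration. It only displays the coefficients and standard errors side by side and does not by itself identify which adjustment each command applies.

```stata
* Sketch: line up coefficients and standard errors from the three fits
webuse grunfeld, clear
xtset company year

* (2) FE with year indicators, cluster-robust standard errors
xtreg invest mvalue kstock i.year, fe vce(robust)
estimates store fe2

* (3) RE with company and year indicators
xtreg invest mvalue kstock i.company i.year, re vce(robust)
estimates store re3

* Pooled OLS with both sets of indicators, clustered on company
regress invest mvalue kstock i.company i.year, vce(cluster company)
estimates store ols

* Side-by-side table restricted to the two key regressors
estimates table fe2 re3 ols, keep(mvalue kstock) b(%9.5f) se(%9.5f)
```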
Eric de Souza

Join Date: Mar 2014

Posts: 587
#11

08 Apr 2021, 02:37

@ Joro: when I said that model 3 does not make sense, I meant that once one introduces the two-way indicator variables into the equation, the RE transformation (and this is my problem: I don't know how to complete the sentence !)
I was going to work through the algebra but then got involved in other matters.