What's the difference between Fixed-effects model & Random-effects model in Panel data analysis

Chen Samulsion

Join Date: Jan 2018

Posts: 932
#1

What's the difference between Fixed-effects model & Random-effects model in Panel data analysis

22 Feb 2018, 06:20

Dear Stata Users,

I have a question that has been perplexing me these days. I want to know exactly what the difference is between the fixed-effects model and the random-effects model in panel data analysis. Different textbooks and disciplines discuss this topic in different ways and emphasize different features. Some authors introduce the two models in the framework of OLS or dummy-variable regression, while others introduce them in the framework of ANOVA and mixed linear models. Based on Wooldridge's famous textbook, Introductory Econometrics: A Modern Approach, I've grasped some fundamentals. On the one hand, the fixed-effects model uses only within-subject variation and allows the unobserved effects to be correlated with the explanatory variables. On the other hand, the random-effects model uses both within-subject and between-subject variation and assumes that the unobserved effects are uncorrelated with the explanatory variables. However, I still cannot fully understand the terminology: why are the two models called "fixed" and "random", respectively? Does the terminology relate to different forms of the model intercepts? In both models, it is common to use subject-specific parameters, {αi}, to represent the heterogeneity among subjects. I read somewhere else that

In the fixed-effects model, the subject-specific parameters are represented as fixed, yet unknown, parameters. In the random-effects model, the subject-specific parameters {αi} are represented as random variables.

What's the difference between fixed parameters and parameters treated as random variables? Can anyone give an illustration (maybe with graphs)? Thank you!
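For reference, both models are usually written with the same equation; what differs is the assumption made about {αi}. The following is a standard textbook formulation, not something taken from a specific post in this thread. The symbols x_it, β and ε_it are notation introduced here for illustration.

```latex
\[
y_{it} = \mathbf{x}_{it}\boldsymbol{\beta} + \alpha_i + \varepsilon_{it},
\qquad i = 1,\dots,N, \quad t = 1,\dots,T_i
\]
\[
\text{FE: } \operatorname{Cov}(\alpha_i, \mathbf{x}_{it}) \text{ is left unrestricted;}
\qquad
\text{RE: } \operatorname{Cov}(\alpha_i, \mathbf{x}_{it}) = 0 \text{ is assumed.}
\]
```

Under this reading, the labels say less about whether αi is "really" fixed or random than about which assumption the estimator relies on.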
Tags: fixed effects, panel data, random effects, regression
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17734
#2

22 Feb 2018, 06:42

Chen:
the textbook you quoted is really comprehensive: so it's difficult (for me, at least) to add some better explanations to what is already covered there.
That said, you may benefit from reading Paul Allison's (brief) textbook on the same topic: https://uk.sagepub.com/en-gb/eur/fix...els/book226025.
I would also take a look at the -xtreg- entry in the Stata .pdf manual.
Last but not least, the helpfile of the user-written programme -xtoverid- (type -search xtoverid- from within Stata to get it) provides an interesting explanation of the two specifications in terms of overidentifying restrictions.

Kind regards,
Carlo
(Stata 19.0)
Chen Samulsion

Join Date: Jan 2018

Posts: 932
#3

22 Feb 2018, 07:16

Dear Carlo, thank you for your informative reply. I've read the -xtreg- entry in the Stata manual, and I must say that it is as understandable as Wooldridge's text. As I said above, I can follow the computations (demeaning, quasi-demeaning, etc.), but I am confused about the different intercepts. I will follow your advice on further reading. Thank you very much!
Chen Samulsion

Join Date: Jan 2018

Posts: 932
#4

22 Feb 2018, 07:16

Additionally: does the fixed-effects model correspond to no pooling, while the random-effects model corresponds to partial pooling?
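One way to look at the pooling question is through the quasi-demeaning that the GLS random-effects estimator applies. The sketch below uses standard notation that is assumed here rather than taken from the thread: σ_u is the standard deviation of the subject effect, σ_e that of the idiosyncratic error (matching the sigma_u and sigma_e labels in -xtreg- output), T_i is the number of observations for subject i, and v_it is the composite error.

```latex
\[
y_{it} - \theta_i \bar{y}_i
  = (1 - \theta_i)\,\beta_0
  + (\mathbf{x}_{it} - \theta_i \bar{\mathbf{x}}_i)\boldsymbol{\beta}
  + (v_{it} - \theta_i \bar{v}_i),
\qquad
\theta_i = 1 - \sqrt{\frac{\sigma_e^2}{T_i\,\sigma_u^2 + \sigma_e^2}}
\]
```

With θ_i = 0 this is pooled OLS (complete pooling); as θ_i approaches 1 it approaches the within (fixed-effects) transformation (no pooling of intercepts); values in between correspond to partial pooling.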

Last edited by Chen Samulsion; 22 Feb 2018, 07:32.
Festus odingo

Join Date: Feb 2018

Posts: 13
#5

22 Feb 2018, 07:32

Chen, you can also grab "Microeconometrics Using Stata" by Cameron and Trivedi and go through "Linear panel-data models: Basics", from page 229 through page 235. It's very comprehensive, with practical examples. Together with Carlo's insights, I think all should be fine! Best
Eric de Souza

Join Date: Mar 2014

Posts: 587
#6

22 Feb 2018, 08:02

Do not try to interpret the terminology literally. This leads to the kind of quotation in your first post (#1 above). There was a time when fixed effects were considered, well, fixed and random effects random. A more neutral terminology is "unobserved effects" or "unobserved heterogeneity". Since they are unobserved they are treated as random variables. In the case of the FE model the effect is correlated with the (other) explanatory variables whereas in the RE model it is not correlated with the (other) explanatory variables. The terms FE and RE have stuck.
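The correlation assumption that separates the two estimators is often examined with a Hausman test. The following is only a minimal sketch: the nlswork dataset and the single regressor tenure are illustrative choices, and the stored-estimate names fe and re are arbitrary.

```stata
* Sketch: compare FE and RE estimates with a Hausman test
webuse nlswork, clear
xtset idcode year

* Within (fixed-effects) estimator: consistent under both hypotheses
xtreg ln_wage tenure, fe
estimates store fe

* GLS random-effects estimator: efficient only if the unobserved
* effect is uncorrelated with the regressors
xtreg ln_wage tenure, re
estimates store re

* H0: the difference between the FE and RE coefficients is not systematic
hausman fe re
```

A rejection is usually read as evidence against the RE assumption, though the test itself rests on assumptions (for example, about the error structure) that may not hold in a given application.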
Chen Samulsion

Join Date: Jan 2018

Posts: 932
#7

22 Feb 2018, 22:33

Dear Eric de Souza, thank you for your reply. Do you think the terms FE and RE are outdated and misleading? I notice some equivalence as follows:

Code:

xtset id year
xtreg depvar indepvars, mle       // random-effects model
/* is equivalent to */
mixed depvar indepvars || id:     // random-intercept model

This equivalence means that the random-effects model (panel data) can be regarded as a two-level mixed model, in which the first level represents observations within subjects and the second level represents subjects. Thus the random-effects model shows how much variation is at the observation level (within-subject) and how much is at the subject level (between-subject). In this model the term "random" means that each subject has its own intercept, and this intercept belongs to the random part of the model. And can I say that in the fixed-effects model the unobserved heterogeneity is wholly absorbed in subject-specific intercepts, which are correlated with the explanatory variables, whereas in the random-effects model the unobserved heterogeneity is only partly absorbed (so some part is left, and that is exactly the random-effect part {μ_i} of the model) in intercepts that are uncorrelated with the explanatory variables?
Here is a useful link:
http://www.bris.ac.uk/cmm/learning/v...ntercepts.html

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17734
#8

23 Feb 2018, 00:27

Chen:
what you report holds if you specify -mle- as an option for -xtreg-; otherwise, the coefficients are slightly different, as you can see from the following toy example:

Code:

. use "http://www.stata-press.com/data/r15/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. xtset idcode year
       panel variable:  idcode (unbalanced)
        time variable:  year, 68 to 88, but with gaps
                delta:  1 unit

. xtreg ln_wage tenure i.race

Random-effects GLS regression                   Number of obs     =     28,101
Group variable: idcode                          Number of groups  =      4,699

R-sq:                                           Obs per group:
     within  = 0.0972                                         min =          1
     between = 0.2079                                         avg =        6.0
     overall = 0.1569                                         max =         15

                                                Wald chi2(3)      =    3532.05
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      tenure |   .0376405   .0006448    58.37   0.000     .0363767    .0389043
             |
        race |
      black  |  -.1345322   .0120866   -11.13   0.000    -.1582215   -.1108429
      other  |   .1039944   .0504227     2.06   0.039     .0051677    .2028211
             |
       _cons |    1.59266   .0066729   238.68   0.000     1.579581    1.605738
-------------+----------------------------------------------------------------
     sigma_u |  .33623102
     sigma_e |  .30357621
         rho |  .55090591   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. mixed ln_wage tenure i.race || idcode:

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood = -10994.645 
Iteration 1:   log likelihood = -10994.645 

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =     28,101
Group variable: idcode                          Number of groups  =      4,699

                                                Obs per group:
                                                              min =          1
                                                              avg =        6.0
                                                              max =         15

                                                Wald chi2(3)      =    3537.28
Log likelihood = -10994.645                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      tenure |   .0376721    .000645    58.41   0.000      .036408    .0389362
             |
        race |
      black  |   -.134563   .0120309   -11.18   0.000    -.1581433   -.1109828
      other  |   .1039207   .0501976     2.07   0.038     .0055352    .2023063
             |
       _cons |   1.592605   .0066445   239.69   0.000     1.579582    1.605628
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
idcode: Identity             |
                  var(_cons) |   .1122818   .0028764      .1067834    .1180632
-----------------------------+------------------------------------------------
               var(Residual) |   .0927218     .00086      .0910515    .0944227
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 11435.11      Prob >= chibar2 = 0.0000

. xtreg ln_wage tenure i.race, mle

Fitting constant-only model:
Iteration 0:   log likelihood = -12664.968
Iteration 1:   log likelihood = -12650.767
Iteration 2:   log likelihood = -12650.626
Iteration 3:   log likelihood = -12650.626

Fitting full model:
Iteration 0:   log likelihood = -11114.889
Iteration 1:   log likelihood = -10995.292
Iteration 2:   log likelihood = -10994.645
Iteration 3:   log likelihood = -10994.645

Random-effects ML regression                    Number of obs     =     28,101
Group variable: idcode                          Number of groups  =      4,699

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          1
                                                              avg =        6.0
                                                              max =         15

                                                LR chi2(3)        =    3311.96
Log likelihood  = -10994.645                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      tenure |   .0376721   .0006486    58.08   0.000     .0364008    .0389434
             |
        race |
      black  |   -.134563   .0120311   -11.18   0.000    -.1581436   -.1109825
      other  |   .1039207   .0501979     2.07   0.038     .0055347    .2023068
             |
       _cons |   1.592605   .0066455   239.65   0.000      1.57958     1.60563
-------------+----------------------------------------------------------------
    /sigma_u |   .3350847    .004292                      .3267773    .3436033
    /sigma_e |   .3045026   .0014121                      .3017475    .3072828
         rho |   .5477064   .0069466                      .5340659    .5612908
------------------------------------------------------------------------------
LR test of sigma_u=0: chibar2(01) = 1.1e+04            Prob >= chibar2 = 0.000

.

An interesting textbook on multilevel models and their equivalence with -xtreg, mle- is: https://www.stata.com/bookstore/mult...lain-language/
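The agreement between the -mixed- and -xtreg, mle- outputs above can also be checked by hand: -mixed- reports variances, whereas -xtreg- reports standard deviations. Using only the figures printed above (rounded):

```latex
\[
\hat{\sigma}_u^2 = 0.3350847^2 \approx 0.11228
\quad (\text{var(\_cons)} = 0.1122818),
\qquad
\hat{\sigma}_e^2 = 0.3045026^2 \approx 0.09272
\quad (\text{var(Residual)} = 0.0927218)
\]
\[
\hat{\rho} = \frac{\hat{\sigma}_u^2}{\hat{\sigma}_u^2 + \hat{\sigma}_e^2}
= \frac{0.1122818}{0.1122818 + 0.0927218} \approx 0.5477
\]
```

This matches the rho of .5477064 reported by -xtreg, mle-, i.e. roughly 55% of the residual variance is at the subject level.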

Kind regards,
Carlo
(Stata 19.0)


daniel klein

Join Date: Mar 2014

Posts: 3886
#9

23 Feb 2018, 00:38

Originally posted by Chen Samulsion

And can I say that in fixed-effect model the unobserved heterogeneity is wholly absorbed in subject-specific intercepts which are correlated with explanatory variables, in contrast, in random-effect model the unobserved heterogeneity is partly absorbed (thus some part is left and that is exactly the random effect part {μ_i} in model) in intercepts which are uncorrelated with explanatory variables?

I would tend to say that the statement goes in the right direction, but it seems to mix assumptions with model properties. In the FE model (within-variance estimator), the fixed effects are not necessarily correlated with the (time-invariant!) explanatory variables; there is merely no need to assume that they are not. In other words, the coefficients for the time-varying variables will be consistent even if the time-invariant (observed or unobserved) predictors are correlated with the individual fixed effects (intercepts, if you want). This is because both get wiped out by the within-transformation (de-meaning). Basically, in the de-meaned model there are no intercepts, just as the intercept is 0 in a plain-vanilla linear regression in which all variables have been centered at their means.

In contrast, in the RE model, the time-invariant (and time varying) predictors are assumed to be uncorrelated with the individual intercepts; this assumption might not hold true, in which case the estimates will be biased.
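To make the within-transformation concrete, here is a minimal sketch that de-means the variables by hand and compares the result with -xtreg, fe-. The dataset (nlswork) and the generated variable names (mean_lw, mean_ten, dm_lw, dm_ten) are illustrative assumptions, not part of the discussion above.

```stata
* Sketch: the within transformation done by hand
webuse nlswork, clear
xtset idcode year

* Use the same estimation sample in both regressions
keep if !missing(ln_wage, tenure)

* Subject-specific means of the outcome and the regressor
bysort idcode: egen double mean_lw  = mean(ln_wage)
bysort idcode: egen double mean_ten = mean(tenure)

* De-meaned variables: anything constant within idcode becomes zero
generate double dm_lw  = ln_wage - mean_lw
generate double dm_ten = tenure  - mean_ten

* OLS on the de-meaned data, without an intercept
regress dm_lw dm_ten, noconstant

* The built-in within estimator
xtreg ln_wage tenure, fe
```

The coefficient on tenure should coincide in the two regressions. The standard errors from -regress- will be somewhat too small, because -regress- does not account for the degrees of freedom used up by the subject means.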

The really misleading and unfortunate mix in terminology is the term "fixed-effects", which leads some researchers to believe that the mixed (multilevel, hierarchical) model somehow controls for (unobserved) time-invariant heterogeneity; it does not! The term "fixed-effect" as used in the multilevel framework has nothing to do with the sources of variance that are used in the estimation.

Best
Daniel

[Edit]
Chen has correctly pointed out that panel data are the same as multilevel data, where observations/occasions are at level one and subjects (panel units) at level two. It might be worth pointing out the obvious: the reverse is true as well. A dataset that has students (level 1) nested in schools (level 2) can be thought of as a school-panel dataset. This means that we can estimate school fixed effects, controlling for (unobserved) predictors that are constant within schools. However, this is not what the [xt]mixed model is doing!
[/Edit]

Last edited by daniel klein; 23 Feb 2018, 00:52.
Chen Samulsion

Join Date: Jan 2018

Posts: 932
#10

23 Feb 2018, 07:04

Dear daniel klein, thank you very much! I take your point that "fixed effects" in panel data analysis are not the same as "fixed effects" in the multilevel model. As Rebecca Pillinger has said:

(In multilevel model framework) The random intercept model has two parts. It's got a fixed part (which is the intercept and the coefficient of the explanatory variable times the explanatory variable) and it's got a random part, so that's this u_j + e_ij at the end.
So the parameters that we estimate for the fixed part are the coefficients β0, β1 and so on and the parameters that we estimate for the random part are the variances, σ²_u and σ²_e.
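Written out, the random intercept model described in this quotation looks as follows. The symbols follow the quotation; the normality of u_j and e_ij is the usual assumption for maximum likelihood estimation and is added here as an assumption.

```latex
\[
y_{ij} = \underbrace{\beta_0 + \beta_1 x_{ij}}_{\text{fixed part}}
       + \underbrace{u_j + e_{ij}}_{\text{random part}},
\qquad
u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)
\]
```

In the panel-data setting, j indexes subjects and i indexes the occasions within a subject.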

I should learn more.
Chen Samulsion

Join Date: Jan 2018

Posts: 932
#11

25 Feb 2018, 05:08

Cross-reference: material by Andrew Gelman that I read before opening this thread:
http://andrewgelman.com/2015/03/22/no-fixed-random/
and maybe both the myth and its answer lie here:

In multilevel modeling terms, ……“fixed effects” are equal to each other and estimated using complete pooling, whereas in econometric terminology, “fixed effects” vary by group and are estimated using no pooling.
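The three pooling regimes mentioned in this thread can be put side by side in Stata. This is only a sketch: the nlswork dataset, the single regressor tenure, and the stored-estimate names (pooled, fe, re) are illustrative assumptions.

```stata
* Sketch: complete pooling, no pooling, and partial pooling
webuse nlswork, clear
xtset idcode year

* Complete pooling: one common intercept for all subjects
regress ln_wage tenure
estimates store pooled

* No pooling of intercepts: econometric fixed effects (within estimator)
xtreg ln_wage tenure, fe
estimates store fe

* Partial pooling: random intercepts (GLS random effects)
xtreg ln_wage tenure, re
estimates store re

* Coefficients and standard errors side by side
estimates table pooled fe re, b(%9.4f) se(%9.4f)
```

The RE coefficient on tenure would be expected to lie between the pooled OLS and FE coefficients, since the RE estimator is a weighted combination of within and between variation.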

Last edited by Chen Samulsion; 25 Feb 2018, 05:21.
Eric de Souza

Join Date: Mar 2014

Posts: 587
#12

25 Feb 2018, 06:26

The conclusion seems to be to look at the mathematical structure of the model. The same term may mean different things in different frameworks.