Fixed effects

Sabrina Solanki

Join Date: Jul 2016

Posts: 31
#1

06 Sep 2016, 14:23

I ran a regression with an individual fixed-effects specification. 60% of my sample is observed more than once. I have been told that only those observations (the 60%) are included in my fixed-effects estimate. But when I actually run the regression in Stata, the reported N reflects my full sample. Why does Stata include the 40% of my sample that is observed only once?

Thanks.
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

06 Sep 2016, 14:31

Without seeing the actual commands you used (starting with your -xtreg- command and everything following that through the regression itself) and the actual output you got from Stata, it is impossible to answer your question. Please post those by copy/pasting directly from Stata's Results window or your log file into a code block here. (See FAQ #12 if you are not familiar with code blocks.)

That said, your expectation that individuals observed only once are excluded would be true for a fixed effects logit model. But it is not true for fixed-effects linear regression. Singleton observations are included in fixed-effects linear regression.
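
A quick way to check this is sketched below. It is only an illustration: it assumes Stata's bundled Grunfeld example data rather than the original poster's data, it artificially trims companies 9 and 10 to one observation each, and the variable name n_obs is made up for the example.

```stata
* Sketch: singleton groups still count toward N in -xtreg, fe-
webuse grunfeld, clear

* Assumption: create two singleton groups by keeping only the
* first year for companies 9 and 10
by company (year), sort: drop if company > 8 & _n > 1

* n_obs (hypothetical name) = number of observations in each group
bysort company: generate n_obs = _N
count if n_obs == 1

* The reported number of observations includes the singletons
xtreg mvalue kstock, fe
display "observations used: " e(N) "   groups: " e(N_g)
```

If the singletons were excluded, e(N) would fall short of the number of rows in the data by the count shown just before the regression.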
Sabrina Solanki

Join Date: Jul 2016

Posts: 31
#3

10 Sep 2016, 13:57

Thank you!
Alfonso Sánchez-Peñalver

Join Date: Mar 2014

Posts: 432
#4

10 Sep 2016, 16:49

Originally posted by Clyde Schechter

That said, your expectation that individuals observed only once are excluded would be true for a fixed effects logit model. But it is not true for fixed-effects linear regression. Singleton observations are included in fixed-effects linear regression.

Can someone explain how that actually works, or how Stata makes it work? This is my thought process, which is why I ask. The fixed-effects (within) estimator is equivalent to a regression on the group-demeaned variables. In this approach, groups with just one observation would have zeros for all (demeaned) variables, since there is no within variation. What is the fixed effect for those groups?

Alfonso Sanchez-Penalver
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#5

10 Sep 2016, 18:41

The complete equations for estimating all model parameters with -xtreg, fe- are shown at http://www.stata.com/support/faqs/st...effects-model/. You will see that the method used does not suppose or require that any group exhibit within-variance. Everything is calculated from demeaned data, group means, and grand means.
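
The transformation described there can be reproduced by hand. The sketch below is an illustration under assumptions: it uses the Grunfeld example data with two artificial singleton groups, and the names gm_*, t_*, b_manual, and a_manual are hypothetical. Each variable has its group mean subtracted and the grand mean added back, and plain OLS on the result should give the same slope and constant as -xtreg, fe-. The standard errors from -regress- will differ, because -regress- does not account for the absorbed group means in its degrees of freedom.

```stata
webuse grunfeld, clear
by company (year), sort: drop if company > 8 & _n > 1

* Transform each variable: subtract the group mean, add back the grand mean.
* For a singleton group the result is just the grand mean.
foreach v in mvalue kstock {
    egen double gm_`v' = mean(`v'), by(company)
    summarize `v', meanonly
    generate double t_`v' = `v' - gm_`v' + r(mean)
}

* OLS on the transformed data
regress t_mvalue t_kstock
scalar b_manual = _b[t_kstock]
scalar a_manual = _b[_cons]

* Built-in fixed-effects estimator for comparison
xtreg mvalue kstock, fe
display "slope difference:    " b_manual - _b[kstock]
display "constant difference: " a_manual - _b[_cons]
```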
Alfonso Sánchez-Peñalver

Join Date: Mar 2014

Posts: 432
#6

10 Sep 2016, 21:11

Thanks Clyde.

Alfonso Sanchez-Penalver
Alfonso Sánchez-Peñalver

Join Date: Mar 2014

Posts: 432
#7

11 Sep 2016, 06:10

So for all those singleton observations, the model reduces to an equation in the grand averages. Letting m denote a grand average (my for the dependent variable, mx for the regressors), the equation for the singletons is

Code:

my = a + mx b + noise

following the explanation, since the deviations from the group averages are all zero. Remember that OLS will set

Code:

a = my - mx b.

So even though they're included in the estimation, the singletons really have no explanatory power, and the coefficient estimates would be the same as if they were not in it. A question then arises: what is the appropriate number of degrees of freedom? Can we say that we're using the full sample when, in reality, we're not?
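
For the one-regressor case, the argument can be written out as follows. The notation is an assumption made for illustration (i indexes groups, t indexes observations within a group, T_i is the group size) and is not taken from the FAQ.

```latex
% Within estimator with a single regressor x
\hat{b} = \frac{\sum_{i}\sum_{t=1}^{T_i} (x_{it} - \bar{x}_i)(y_{it} - \bar{y}_i)}
               {\sum_{i}\sum_{t=1}^{T_i} (x_{it} - \bar{x}_i)^2}

% For a singleton group, T_i = 1, so x_{i1} = \bar{x}_i and y_{i1} = \bar{y}_i:
(x_{i1} - \bar{x}_i)(y_{i1} - \bar{y}_i) = 0, \qquad (x_{i1} - \bar{x}_i)^2 = 0

% The reported constant uses the grand averages, which do include the singletons:
\hat{a} = m_y - m_x \hat{b}
```

Singleton groups add zero to both the numerator and the denominator of the slope, but they still enter the grand averages that determine the reported constant.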

Last edited by Alfonso Sánchez-Peñalver; 11 Sep 2016, 06:16.

Alfonso Sanchez-Penalver
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#8

11 Sep 2016, 10:58

Your observation is (mostly) correct: if you drop the singleton groups, the estimates of the coefficients do not change (although the constant term does, and so does the estimate of sigma_u; sigma_e stays the same, because the singletons' residuals are zero and the degrees of freedom are unchanged).

But the degrees of freedom are not a problem here. With a single regressor, df = #obs - #groups - 1 (in general, #obs - #groups - #regressors). If you drop the singleton groups, then #obs and #groups both decrease by the same amount and the df comes out the same either way.
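
This can be checked directly. The sketch below is an illustration, not a general proof: it assumes the Grunfeld example data with companies 9 and 10 artificially reduced to one observation each, and the scalar and variable names (b_all, df_all, n_all, g_all, n_obs) are hypothetical.

```stata
webuse grunfeld, clear
by company (year), sort: drop if company > 8 & _n > 1

* Fit with the singleton groups included and save the key results
xtreg mvalue kstock, fe
scalar b_all  = _b[kstock]
scalar df_all = e(df_r)
scalar n_all  = e(N)
scalar g_all  = e(N_g)

* Drop the singleton groups and refit
bysort company: generate n_obs = _N
drop if n_obs == 1
xtreg mvalue kstock, fe

* Compare: slope and residual df should agree; obs and groups both fall by 2
display "slope:  " b_all  " vs " _b[kstock]
display "df:     " df_all " vs " e(df_r)
display "obs:    " n_all  " vs " e(N)
display "groups: " g_all  " vs " e(N_g)
```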
Alfonso Sánchez-Peñalver

Join Date: Mar 2014

Posts: 432
#9

11 Sep 2016, 11:51

Ah yes, you're right about the degrees of freedom. The intercept would change if you use the xtreg, fe command, but that is because it's basically artificial. Notice that in a fixed-effects estimation, if you consider the intercept you should be considering the panel-specific intercept. For the panels with more than one observation, those would still be the same. The intercept for the singletons can still be calculated as the difference between the grand average of the explained variable and the sum of the marginal effects evaluated at the grand averages of the explanatory variables.

The only issue may be with comparisons with the random-effects model, if you want to do a Hausman specification test and you have two estimations with different observations. That's my best guess as to why the estimation is done with all of the observations, even though the singletons don't add any value to the model. Thanks Clyde!

Alfonso Sanchez-Penalver
Alfonso Sánchez-Peñalver

Join Date: Mar 2014

Posts: 432
#10

11 Sep 2016, 12:06

Also notice another thing. Let there be g panels, of which s are singletons. The number of intercepts we are estimating is g - s + 1, because the intercept for all the singletons is the same. So the degrees of freedom should really be n - g + s - 2, not n - g - 1, so we really do have an issue with the degrees of freedom.

Alfonso Sanchez-Penalver

Clyde Schechter

Join Date: Apr 2014
Posts: 30117

#11

11 Sep 2016, 12:18

I don't think so. The intercepts for the singletons are not all the same, unless they also happen to have the same values for the dependent variable. Try this:

Code:

. webuse grunfeld

. by company (year), sort: drop if company > 8 & _n > 1
(38 observations deleted)

. tab company

    company |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         20       12.35       12.35
          2 |         20       12.35       24.69
          3 |         20       12.35       37.04
          4 |         20       12.35       49.38
          5 |         20       12.35       61.73
          6 |         20       12.35       74.07
          7 |         20       12.35       86.42
          8 |         20       12.35       98.77
          9 |          1        0.62       99.38
         10 |          1        0.62      100.00
------------+-----------------------------------
      Total |        162      100.00

. xtreg mvalue kstock, fe

Fixed-effects (within) regression               Number of obs     =        162
Group variable: company                         Number of groups  =         10

R-sq:                                           Obs per group:
     within  = 0.1407                                         min =          1
     between = 0.4987                                         avg =       16.2
     overall = 0.2129                                         max =         20

                                                F(1,151)          =      24.72
corr(u_i, Xb)  = 0.3656                         Prob > F          =     0.0000

------------------------------------------------------------------------------
      mvalue |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      kstock |    .551888   .1110068     4.97   0.000      .332561     .771215
       _cons |   1119.766   44.13106    25.37   0.000     1032.572     1206.96
-------------+----------------------------------------------------------------
     sigma_u |  1260.6442
     sigma_e |  361.49714
         rho |  .92401891   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(9, 151) = 188.59                    Prob > F = 0.0000

. predict u, u

. tabstat u, by(company)

Summary for variables: u
     by categories of: company 

 company |      mean
---------+----------
       1 |  2856.216
       2 |  689.3324
       3 |  600.7158
       4 | -493.4693
       5 | -1156.935
       6 | -757.4543
       7 |  -1143.79
       8 | -496.1194
       9 | -918.5715
      10 | -1051.339
---------+----------
   Total |  9.04e-06
--------------------

.


Alfonso Sánchez-Peñalver

Join Date: Mar 2014

Posts: 432
#12

11 Sep 2016, 12:27

Oh yes, that's right. I was thinking that since the equation for the singletons was

Code:

my = a + mx b + noise

they would all have the same v_i, but you're right: it depends on their respective y_i, since that is their group mean. Sorry for the confusion.
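
To make the point concrete, the fixed effect of a singleton can be computed by hand as its own y minus the fitted constant and slope term, since its group means are just its single observed values. The sketch below is an illustration using the same trimmed Grunfeld data as in the example earlier in the thread; u_hat and u_manual are hypothetical variable names.

```stata
webuse grunfeld, clear
by company (year), sort: drop if company > 8 & _n > 1

xtreg mvalue kstock, fe

* Stata's own estimate of the fixed effect for each observation's group
predict double u_hat, u

* Hand calculation for the singletons: group mean of y minus constant
* minus slope times group mean of x, where the group means are the
* single observed values
generate double u_manual = mvalue - _b[_cons] - _b[kstock]*kstock if company > 8

list company mvalue kstock u_hat u_manual if company > 8
```

The two singletons get different fixed effects because their values of mvalue and kstock differ, which matches the distinct values shown for companies 9 and 10 in the tabstat output above.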

Alfonso Sanchez-Penalver
