  • Fixed effects

    I ran a regression with an individual fixed effects specification. 60% of the individuals in my sample are observed more than once. I have been told that only those observations (the 60%) are used in my fixed effects estimate. But when I actually run the regression in Stata, the reported N reflects my full sample. Why does Stata include the 40% of my sample that are observed only once?

    Thanks.

  • #2
    Without seeing the actual commands you used (starting with your -xtreg- command and everything following that through the regression itself) and the actual output you got from Stata, it is impossible to answer your question. Please post those by copy/pasting directly from Stata's Results window or your log file into a code block here. (See FAQ #12 if you are not familiar with code blocks.)

    That said, your expectation that individuals observed only once are excluded would be true for a fixed effects logit model. But it is not true for fixed-effects linear regression. Singleton observations are included in fixed-effects linear regression.


    • #3
      Thank you!


      • #4
        Originally posted by Clyde Schechter
        That said, your expectation that individuals observed only once are excluded would be true for a fixed effects logit model. But it is not true for fixed-effects linear regression. Singleton observations are included in fixed-effects linear regression.
        Can someone explain how that actually works, or how Stata makes it work? Here is my thought process, which is why I ask. The fixed-effects (within) estimator is equivalent to a regression on the group-demeaned variables. Under this approach, groups with just one observation would have zeros for all (demeaned) variables, since there is no within variation. What is the fixed effect for those groups?
        Alfonso Sanchez-Penalver


        • #5
          The complete equations for estimating all model parameters with -xtreg, fe- are shown at http://www.stata.com/support/faqs/st...effects-model/. You will see that the method used does not suppose or require that any group exhibit within-variance. Everything is calculated from demeaned data, group means, and grand means.


          • #6
            Thanks Clyde.
            Alfonso Sanchez-Penalver


            • #7
              So for all those singleton observations, the model reduces to a function of the grand averages. Letting m denote a grand average (my for y, mx for x), the equation for the singletons is
              Code:
              my = a + mx b + noise
              This follows from the explanation in the FAQ, since the deviations from the group averages are all zero for a singleton. Remember that OLS will set
              Code:
              a = my - mx b.
              So even though they're included in the estimation, the singletons really have no explanatory power, and the estimation would be the same as if they were not in it. A question then arises: what is the appropriate number of degrees of freedom? Can we say that we're using the full sample when, in reality, we're not?
              Last edited by Alfonso Sánchez-Peñalver; 11 Sep 2016, 06:16.
              Alfonso Sanchez-Penalver
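
              The argument in this post can be written out as a short derivation. This is a sketch for the single-regressor case, assuming the transformed regression described in posts #5 and #7; the tilde notation and the group size T_i are introduced here for illustration, while m_y and m_x are the grand averages over all observations, as in the post.

              ```latex
              % Transformed variables used by the within estimator with a constant
              \tilde{y}_{it} = y_{it} - \bar{y}_i + m_y, \qquad
              \tilde{x}_{it} = x_{it} - \bar{x}_i + m_x

              % Within-group deviations sum to zero, so the sample means of the
              % transformed variables are m_y and m_x. OLS of \tilde{y} on \tilde{x} gives
              \hat{b} = \frac{\sum_{i}\sum_{t} (x_{it} - \bar{x}_i)(y_{it} - \bar{y}_i)}
                             {\sum_{i}\sum_{t} (x_{it} - \bar{x}_i)^2}, \qquad
              \hat{a} = m_y - m_x \hat{b}

              % Singleton group (T_i = 1): x_{it} = \bar{x}_i and y_{it} = \bar{y}_i, so
              (x_{it} - \bar{x}_i)(y_{it} - \bar{y}_i) = 0, \qquad
              (x_{it} - \bar{x}_i)^2 = 0
              ```

              A singleton therefore adds zero to both the numerator and the denominator of the slope, which is why the slope is unaffected by its presence. The constant, however, depends on m_y and m_x, which are computed over all observations, singletons included.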


              • #8
                Your observation is (mostly) correct: if you drop the singleton groups, the estimates of the coefficients do not change (although the constant term does, and so do the estimates of sigma_u and sigma_e.)

                But the degrees of freedom are not a problem here. df = #obs - #groups - #regressors, which is #obs - #groups - 1 with a single regressor. If you drop the singleton groups, then #obs and #groups both decrease by the same amount and the df comes out the same either way.
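
                A minimal sketch of how one might check this in Stata, borrowing the Grunfeld setup that appears later in this thread. The estimate names with_singletons and no_singletons and the indicator variable singleton are hypothetical names chosen for illustration.

                ```stata
                * Sketch: fit -xtreg, fe- with and without singleton groups and compare
                webuse grunfeld, clear

                * Keep only the first year for companies 9 and 10, making them singletons
                by company (year), sort: drop if company > 8 & _n > 1

                * Fit on the full sample, singletons included
                xtreg mvalue kstock, fe
                estimates store with_singletons
                display "residual df with singletons:    " e(df_r)

                * Flag groups observed only once, then refit without them
                by company, sort: generate byte singleton = (_N == 1)
                xtreg mvalue kstock if !singleton, fe
                estimates store no_singletons
                display "residual df without singletons: " e(df_r)

                * Side-by-side coefficients, standard errors, N, and residual df
                estimates table with_singletons no_singletons, b(%10.6f) se stats(N df_r)
                ```

                With this setup the arithmetic is 162 - 10 - 1 = 151 with the singletons and 160 - 8 - 1 = 151 without them, so the two displayed df values should agree. The table makes it easy to compare the kstock coefficient and the constant across the two fits.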


                • #9
                  Ah yes, you're right about the degrees of freedom. The intercept would change if you use the xtreg, fe command, but that is because it's basically artificial. Notice that in a fixed-effects estimation, the intercept to consider is the panel-specific intercept. For the panels with more than one observation, that would still be the same. The intercept for the singletons can still be calculated as the difference between the grand average of the explained variable and the sum of the marginal effects evaluated at the grand averages of the explanatory variables.

                  The only issue may be with comparisons with the random-effects model, if you want to do a Hausman specification test and you have two estimations with different observations. That's my best guess as to why the estimation uses all of the observations, even though the singletons don't add any value to the model. Thanks Clyde!
                  Alfonso Sanchez-Penalver


                  • #10
                    Also notice another thing. Let there be g panels, of which s are singletons. The number of intercepts we are estimating is g - s + 1, because the intercept for all the singletons is the same. So with one regressor the degrees of freedom should really be n - g + s - 2, not n - g - 1, so we really do have an issue with the degrees of freedom.
                    Alfonso Sanchez-Penalver


                    • #11
                      I don't think so. The intercepts for the singletons are not all the same, unless they also happen to have the same values for the dependent variable. Try this:

                      Code:
                      . webuse grunfeld
                      
                      . by company (year), sort: drop if company > 8 & _n > 1
                      (38 observations deleted)
                      
                      . tab company
                      
                          company |      Freq.     Percent        Cum.
                      ------------+-----------------------------------
                                1 |         20       12.35       12.35
                                2 |         20       12.35       24.69
                                3 |         20       12.35       37.04
                                4 |         20       12.35       49.38
                                5 |         20       12.35       61.73
                                6 |         20       12.35       74.07
                                7 |         20       12.35       86.42
                                8 |         20       12.35       98.77
                                9 |          1        0.62       99.38
                               10 |          1        0.62      100.00
                      ------------+-----------------------------------
                            Total |        162      100.00
                      
                      . xtreg mvalue kstock, fe
                      
                      Fixed-effects (within) regression               Number of obs     =        162
                      Group variable: company                         Number of groups  =         10
                      
                      R-sq:                                           Obs per group:
                           within  = 0.1407                                         min =          1
                           between = 0.4987                                         avg =       16.2
                           overall = 0.2129                                         max =         20
                      
                                                                      F(1,151)          =      24.72
                      corr(u_i, Xb)  = 0.3656                         Prob > F          =     0.0000
                      
                      ------------------------------------------------------------------------------
                            mvalue |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                            kstock |    .551888   .1110068     4.97   0.000      .332561     .771215
                             _cons |   1119.766   44.13106    25.37   0.000     1032.572     1206.96
                      -------------+----------------------------------------------------------------
                           sigma_u |  1260.6442
                           sigma_e |  361.49714
                               rho |  .92401891   (fraction of variance due to u_i)
                      ------------------------------------------------------------------------------
                      F test that all u_i=0: F(9, 151) = 188.59                    Prob > F = 0.0000
                      
                      . predict u, u
                      
                      . tabstat u, by(company)
                      
                      Summary for variables: u
                           by categories of: company 
                      
                       company |      mean
                      ---------+----------
                             1 |  2856.216
                             2 |  689.3324
                             3 |  600.7158
                             4 | -493.4693
                             5 | -1156.935
                             6 | -757.4543
                             7 |  -1143.79
                             8 | -496.1194
                             9 | -918.5715
                            10 | -1051.339
                      ---------+----------
                         Total |  9.04e-06
                      --------------------
                      
                      .


                      • #12
                        Oh yes, that's right. I was thinking that since the equation for the singletons was
                        Code:
                        my = a + mx b + noise
                        they would all have the same vi, but you're right: it depends on their respective yi, since that is their group mean. Sorry for the confusion.
                        Alfonso Sanchez-Penalver
