  • What is the method xtreg, fe uses to predict the fixed effects?

    Dear users,

    What is the method xtreg, fe uses to predict the fixed effects? Does it run an auxiliary areg regression?

    So far I have not found this in Stata documentation (Methods and Formulas, etc), nor in this forum.

    I ask because the predicted fixed effects always seem to be approximately normally distributed. However, the theory behind the FE estimator (as far as I know) never assumes a particular distribution for these effects; it only allows them to be correlated with the regressors. If normality is assumed in the prediction, I would like to try the results under different assumptions.

    Thanks!
    Alan.

  • #2
    The manual for xtreg says under the heading "Methods and formulas", page 423:
    From the estimates \(\hat{\alpha}\) and \(\hat{\beta}\), estimates \(u_i\) of \(\nu_i\) are obtained as \(u_i = \bar{y}_i - \hat{\alpha} - \bar{x}_i \hat{\beta}\).
    There is no normal distribution involved in this prediction.
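
    A minimal sketch to check this by hand, using simulated data. All variable names (id, t, x, y, nu) and parameter values are made up for illustration. It computes the manual's formula from the panel means and compares the result with what predict, u returns:

    ```stata
    * Simulated panel: 200 individuals, 5 periods (illustrative values only)
    clear
    set seed 12345
    set obs 200
    generate id = _n
    generate double nu = rnormal()            // true individual effect
    expand 5
    bysort id: generate t = _n
    generate double x = rnormal() + nu        // regressor correlated with the effect
    generate double y = 1 + 2*x + nu + rnormal()
    xtset id t

    * Fixed-effects estimation and Stata's own prediction of u_i
    xtreg y x, fe
    predict double u_hat, u

    * The same quantity from the formula: ybar_i - alpha_hat - xbar_i * beta_hat
    egen double ybar = mean(y), by(id)
    egen double xbar = mean(x), by(id)
    generate double u_manual = ybar - _b[_cons] - xbar*_b[x]

    * Compare the two up to rounding error
    assert reldif(u_hat, u_manual) < 1e-6
    ```

    If the prediction follows the formula above, the assert passes silently; no distributional assumption enters at any step.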

    You should not put too much emphasis on the interpretation of the estimated fixed effects, in particular if you have a setting with small T and large N. The individual fixed effects would be severely biased and an interpretation of their distribution would be meaningless.
    https://www.kripfganz.de/stata/


    • #3
      Oh, thank you Sebastian! I overlooked that part in the manual.

      Regarding your comment on the fixed effects: how is it possible that xtreg, fe estimates the coefficients consistently but the fixed effects inconsistently? That does not make sense to me. Judging from the equation by which they are calculated, if the parameters are consistent, then I don't see why the FE would be inconsistent.

      Also, it seems from the formula you gave me that the fixed effects include all time-invariant factors omitted from the FE estimation (e.g. race, gender). Can their effects not be recovered by regressing the predicted fixed effects on them?
      Last edited by Alan Brito; 20 Aug 2016, 08:58.


      • #4
        The coefficient estimates are obtained from a demeaned model that does not involve the fixed effects any more. This allows consistent estimation of these coefficients.

        The fixed-effects estimates are functions of the data for each individual i. If you only have a fixed number of time periods, say T = 3, then you are using only these few observations to estimate one fixed effects parameter. Adding more individuals does not add any information that could help to estimate the fixed effect for an individual already in the sample. You are using just these 3 observations for that particular individual, no matter whether N is 10, 100, 1000, 10000, ... In other words, while \(\hat{\alpha}\) and \(\hat{\beta}\) are consistently estimated, the averages \(\bar{y}_i\) and \(\bar{x}_i\) are not, and therefore the estimates \(u_i\) will be inconsistent.
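
        To spell this out, here is a short sketch assuming the manual's model \(y_{it} = \alpha + x_{it}\beta + \nu_i + \epsilon_{it}\). The error notation \(\epsilon_{it}\), and the simplifying assumption that these errors are serially uncorrelated with common variance \(\sigma_\epsilon^2\), are added here for illustration. Averaging the model over t and substituting into the prediction formula gives

        \[
        u_i = \bar{y}_i - \hat{\alpha} - \bar{x}_i \hat{\beta}
            = \nu_i + \bar{\epsilon}_i + (\alpha - \hat{\alpha}) + \bar{x}_i (\beta - \hat{\beta}),
        \qquad
        \operatorname{Var}(\bar{\epsilon}_i) = \frac{\sigma_\epsilon^2}{T}.
        \]

        As N goes to infinity with T fixed, the last two terms vanish because \(\hat{\alpha}\) and \(\hat{\beta}\) are consistent, but \(\bar{\epsilon}_i\) keeps variance \(\sigma_\epsilon^2 / T\) regardless of N. Only when T grows does this term disappear.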
        https://www.kripfganz.de/stata/


        • #5
          Regarding your second question: Yes, under the assumption that those time-invariant factors are uncorrelated with the unobserved "fixed effects" (or by using appropriate instruments, keyword: Hausman-Taylor estimation), you can recover their effects in such a second stage. However, you should be aware that the conventional standard errors are inconsistent at the second stage. You can find some earlier discussions about this topic on Statalist:
          Fixed Effects and time-invariant variables
          https://www.kripfganz.de/stata/


          • #6
            But are the estimates inconsistent in the strict sense of the word? That is, does the bias fail to disappear even if T increases? Or by inconsistent do you actually mean biased due to the small sample?

            I have just run a regression of the fixed effects on the time-invariant factors and they are highly significant, and the signs are as expected! You mention that the s.e. are inconsistent here. I tried bootstrapping the s.e. with and without clustering, with 100 reps, and the p-values remain at 0.000. I have T=5. Yes, not great, but it seems that there might be something there. Do you think this is defensible?


            • #7
              The estimates are inconsistent for fixed T with N going to infinity. If you instead consider asymptotics where T goes to infinity, there is no inconsistency. With T = 5, there is hardly any argument for the latter.
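
              A small simulation can illustrate the two kinds of asymptotics. This is only a sketch under made-up assumptions: the program name fe_error, the variable names, and the data-generating process (standard normal errors, slope 2, constant 1) are all hypothetical. It reports the standard deviation of the difference between the predicted and the (centered) true effects:

              ```stata
              capture program drop fe_error
              program define fe_error, rclass
                  args N T
                  clear
                  set obs `N'
                  generate id = _n
                  generate double nu = rnormal()          // true individual effect
                  expand `T'
                  bysort id: generate t = _n
                  generate double x = rnormal() + nu      // correlated with the effect
                  generate double y = 1 + 2*x + nu + rnormal()
                  xtset id t
                  quietly xtreg y x, fe
                  predict double u_hat, u
                  * the predicted effects average to zero, so center the true effects
                  quietly summarize nu
                  generate double err = u_hat - (nu - r(mean))
                  quietly summarize err
                  return scalar sd = r(sd)
              end

              set seed 2016
              * Fixed T = 5, growing N
              foreach N of numlist 100 1000 10000 {
                  fe_error `N' 5
                  display "N = `N', T = 5:  sd(error) = " %5.3f r(sd)
              }
              * Larger T for comparison
              fe_error 1000 50
              display "N = 1000, T = 50: sd(error) = " %5.3f r(sd)
              ```

              Under these assumptions the spread of the error should stay near \(\sqrt{1/5} \approx 0.45\) for every N when T = 5, and drop to roughly \(\sqrt{1/50} \approx 0.14\) when T = 50.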

              Bootstrapping is a valid strategy to obtain consistent standard errors. But note that you need to bootstrap both estimation stages jointly, not just the second stage. The inconsistency of the standard errors is a consequence of using an estimated dependent variable at the second stage. If you just bootstrap the second-stage estimation, you cannot fix this problem.
              https://www.kripfganz.de/stata/


              • #8
                Thank you Sebastian. I will look for a book or paper that lays out the asymptotics for T going to infinity. Since an important part of my argument rests on the estimated fixed effects, I need to be clear on this. But it is looking good so far.

                Regarding bootstrapping in two stages, that is clearly the way forward. Do you do this by bootstrapping a program in which the two-step procedure is carried out? The syntax bootstrap ...: ... seems to work only for single-line commands.


                • #9
                  Indeed, you would need to write a small program that does the two-stage estimation, and then you can call this program with bootstrap.
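
                  A minimal sketch of such a program, using simulated data. The program name twostage, the variable names (id, newid, t, x, z, y), and all parameter values are hypothetical. The setup with a pre-generated newid and the cluster() and idcluster() options follows the usual pattern for resampling whole panels; please check it against the bootstrap documentation for your Stata version.

                  ```stata
                  * Simulated panel with a time-invariant regressor z (illustrative only)
                  clear
                  set seed 12345
                  set obs 300
                  generate id = _n
                  generate byte z = runiform() < 0.5            // time-invariant factor
                  generate double nu = 0.5*z + rnormal()        // effect depends on z
                  expand 5
                  bysort id: generate t = _n
                  generate double x = rnormal() + nu
                  generate double y = 1 + 2*x + nu + rnormal()

                  capture program drop twostage
                  program define twostage
                      * Stage 1: within estimator on the (resampled) panels
                      xtset newid
                      xtreg y x, fe
                      tempvar u
                      predict double `u', u
                      * Stage 2: predicted effects on the time-invariant factor;
                      * bootstrap collects _b from this last estimation
                      regress `u' z
                  end

                  * newid must exist for the initial run on the original data;
                  * in each replication it identifies the resampled panels
                  generate newid = id
                  bootstrap _b, reps(200) seed(2016) cluster(id) idcluster(newid): twostage
                  ```

                  Because both stages run inside the program, each replication re-estimates the fixed effects before the second-stage regression, so the reported standard errors reflect the sampling variation of the estimated dependent variable.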
                  https://www.kripfganz.de/stata/
