  • Do lost observations in FE Logit regression lead to bias?

    Hi,
    I am running a fixed-effects logit regression (xtlogit, fe). However, about 10% of the observations are dropped because the outcome for those panels never changes. Will this bias the estimates?

    Thank you very much.

  • #2
    Alex:
    you do not say what is the amount, in absolute terms, of that 10%.
    Anyway, at face value I would not consider that result a bias, but simply a matter of fact.
    That said, is it acceptable in your research field, and/or consistent with the results of other articles on the same research topic, that 10% of the panel_id do not change status across a given number of data waves?
    As an aside, posting what you typed and what Stata gave you back (as per FAQ) can well improve your chances of getting helpful replies.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Alex,

      I agree with Carlo.

      As the identification strategy of xtlogit, fe is based on changes within each individual (or whatever the unit of analysis is), individuals who record only one outcome value in every period are not useful for identifying what leads to changes in the outcome variable. Therefore, I would not call it bias, but you need to be careful with the interpretation of the results. The results from xtlogit, fe have a (slightly) different meaning from those obtained from logit.


      • #4
        Dear Carlo

        Thank you very much. I think my essential worries about the fixed-effects logit lie in two things. First, if there are dummy independent variables on the right-hand side, then xtlogit, fe will control for them but not estimate them. Second, some people suggest that if predictors have larger between variation than within variation, then the fixed-effects logit model will produce standard errors that are too large.

        So if all independent variables in a panel dataset have clearly larger between standard deviation than within standard deviation, does it mean that xtlogit, fe is not a good idea (despite a Hausman test in favour of fixed effects)?

        thank you again!


        • #5
          It doesn't cause bias. The assumptions required for the FE logit model are that the covariates are strictly exogenous conditional on the heterogeneity, that the probability has the logistic form, and that the observations are conditionally independent over time. The observations where y(i,t) doesn't change are simply uninformative about beta. What is perhaps less well known is that the same thing happens in a linear model: If y(i,t) does not change over time for a given i, then it does not contribute to the estimation, either. It's just that programs don't tell you that these observations are being "dropped."
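
The FE logit part of this point can be sketched numerically. The snippet below is a minimal illustration, assuming a single covariate and the standard conditional-likelihood form (the unit's observed outcome sequence compared against every sequence with the same number of ones). The function name `conditional_likelihood` and the data are made up for the example; this is not how Stata implements xtlogit, fe internally.

```python
import math
from itertools import combinations


def conditional_likelihood(x, y, beta):
    """Conditional logit contribution of one unit (scalar covariate).

    x, y: equal-length lists over time; y holds 0/1 outcomes.
    The fixed effect cancels once we condition on sum(y).
    """
    n_periods = len(y)
    n_ones = sum(y)
    numerator = math.exp(beta * sum(xt * yt for xt, yt in zip(x, y)))
    # Every way of placing n_ones successes among the n_periods periods.
    denominator = sum(
        math.exp(beta * sum(x[t] for t in ones))
        for ones in combinations(range(n_periods), n_ones)
    )
    return numerator / denominator


x = [0.5, 1.5, -1.0]  # hypothetical covariate values

for beta in (-2.0, 0.0, 3.0):
    never = conditional_likelihood(x, [0, 0, 0], beta)
    always = conditional_likelihood(x, [1, 1, 1], beta)
    switcher = conditional_likelihood(x, [0, 1, 0], beta)
    print(beta, never, always, switcher)
```

For the all-zero and all-one units there is only one admissible sequence, so the contribution equals 1 for every beta and carries no information about it. Only the unit whose outcome switches has a contribution that moves with beta.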


          • #6
            Dear Prof. Wooldridge,

            Thank you very much. But another consideration is that for dummy independent variables (e.g. whether a country is landlocked) that do not change within an individual over time, the FE logit model will control for them but not estimate them. I think this relates to a more general problem: FE will generate standard errors that are too large when the between variation of the independent variables dominates their within variation. So does it mean that FE logit is not an appropriate choice for an equation with such dummy independent variables, or for an equation in which the independent variables have much higher between variation than within variation (regardless of the result of the Hausman test)?
            Thank you again!


            • #7
              I guess the higher standard errors in FE models are not to be thought of as biased (away from zero and thus "too" high) in any way. They accurately reflect the fact that you are not using all available information in the data when estimating your effects. That is not a "problem". It is a question of efficiency and, of course, an estimator that uses both the within and the between variance will be the efficient one. However, you need to ask yourself what an efficient estimator is good for if it is inconsistent (according to the Hausman test). By the way, a Bayesian statistician might prefer a (slightly) biased estimate with minimal variance over an unbiased one with huge variance.
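
The efficiency point can be made concrete with a small sketch. It assumes the linear analogue with a single regressor and homoskedastic errors, where the sampling variance of a slope is inversely proportional to the sum of squared deviations the estimator uses, and it holds the error variance fixed across estimators. The data and the name `sums_of_squares` are invented for illustration.

```python
def sums_of_squares(units):
    """Return (within_ss, total_ss) for a regressor given as one list per unit."""
    all_x = [xi for x in units for xi in x]
    grand_mean = sum(all_x) / len(all_x)
    # Variation around the grand mean: within plus between.
    total_ss = sum((xi - grand_mean) ** 2 for xi in all_x)
    # Variation around each unit's own mean: all that FE uses.
    within_ss = 0.0
    for x in units:
        x_bar = sum(x) / len(x)
        within_ss += sum((xi - x_bar) ** 2 for xi in x)
    return within_ss, total_ss


# Hypothetical regressor: large differences between units, small changes within.
x_by_unit = [[1, 2], [11, 12], [21, 22]]
within_ss, total_ss = sums_of_squares(x_by_unit)

# Under the stated assumptions, how much larger the within-only standard error is.
se_ratio = (total_ss / within_ss) ** 0.5
print(within_ss, total_ss, se_ratio)
```

The larger standard error here is not a distortion. It reflects that almost all of the variation in this regressor lies between units, which the within estimator deliberately discards.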

              An interesting question is how far the estimated effects can be generalized. If units do not have any change in the outcome, then they do not contribute anything to the estimated effect. However, the estimated effects then cannot be generalized to those units.

              Best
              Daniel


              • #8
                I recently received an email about the post in #5 and I should correct the incorrect statement there. In linear FE estimation, a unit is dropped only if all elements of x(i,t) do not vary with t. If y(i,t) does not vary with t, then the second set of summations that defines the within estimator does not depend on unit i, but the first sum -- the sum of the outer products of x(i,t) - xbar(i) -- will. Because we usually include time dummies at a minimum, situations where x(i,t) - xbar(i) = 0 for t = 1, ..., T for any unit i are rare.

                In other words, the linear case is not like FE (conditional) logit or FE Poisson.
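
A small numerical sketch of the linear case, assuming a single regressor and no time dummies so the arithmetic stays visible. The function `within_slope` and the three toy units are hypothetical. The slope is the sum over units of (x(i,t) - xbar(i)) * (y(i,t) - ybar(i)) divided by the sum over units of (x(i,t) - xbar(i))^2.

```python
def within_slope(units):
    """Within (fixed-effects) slope for one regressor.

    units: list of (x, y) pairs, each a pair of equal-length lists for one unit.
    """
    num = 0.0  # cross-product sum: involves y
    den = 0.0  # squared-deviation sum: involves x only
    for x, y in units:
        x_bar = sum(x) / len(x)
        y_bar = sum(y) / len(y)
        num += sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        den += sum((xi - x_bar) ** 2 for xi in x)
    return num / den


varying = ([1, 2, 3], [2, 4, 6])     # both x and y change over time
constant_y = ([1, 2, 3], [5, 5, 5])  # y never changes, x does
constant_x = ([4, 4, 4], [1, 2, 3])  # x never changes, y does

print(within_slope([varying]))              # 2.0
print(within_slope([varying, constant_y]))  # 1.0
print(within_slope([varying, constant_x]))  # 2.0
```

Adding the constant-y unit changes the estimate because it still enters the denominator, so it is not dropped. Adding the constant-x unit leaves the estimate untouched because it adds zero to both sums.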
