  • Problems with logit/probit

    Dear Statalist People,
    I am estimating a probit/logit model. When I run both, the AIC for the logit model is smaller, so I should use the logit model. But I cannot find a way to test for heteroscedasticity in Stata. With a probit I would use hetprob, but what do I need to do if I run a logit model?
    I also have some other questions. Why can I not just use robust standard errors? And the literature says that you cannot always use hetprob, but I do not understand when to use what.
    One more question: how do I interpret the (p_c) value under ereturn list? When it is bigger than one, does that mean my model is not significant?
    Thanks a lot!

  • #2
    Elisabeth:
    welcome to the list.
    Please take a look at the FAQ on how to post (more) effectively.
    As far as your first question is concerned, investigating heteroskedasticity of the residual distribution in -probit- makes sense, because there the residuals are expected to be normally distributed; normality is not expected of -logit- residuals, however, which instead follow the logistic distribution (please see http://stats.stackexchange.com/quest.../145265#145265).
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      There is an additional problem: the residuals Carlo refers to are the difference between the predicted latent variable and the latent variable, so they cannot be observed. There are many other types of residuals (see e.g. help glm postestimation##predict), but those are not relevant for this particular problem. In fact, heteroscedasticity in a logit/probit context is a completely different problem from heteroscedasticity in linear regression: it has more to do with finding the right point estimates than the right standard errors. Since the issue is finding the right point estimates, robust standard errors won't solve any problems (if there are any). It is an area of active research, so there is no consensus yet on what to do. My (controversial) position in this debate is here: http://maartenbuis.nl/wp/oddsratio.html . In that paper I argue that heteroscedasticity is less often a problem than is sometimes claimed, and if there is no problem, then "solutions" for such non-existent problems make the situation worse rather than better. So my position would be to stay away from hetprob unless you are really really really sure you need it.
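
To make the point about point estimates concrete, here is a minimal sketch in Python (standard library only). It assumes the multiplicative variance form usually written for a heteroskedastic probit, Pr(y=1) = Phi(x*beta / exp(z*gamma)); the function names and the values of beta and gamma are hypothetical and chosen only for illustration. It shows that when the latent error variance depends on z, the probability itself changes, which is why this is a problem of the estimates and not only of their standard errors.

```python
import math

def normal_cdf(value):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(value / math.sqrt(2.0)))

def prob_homoskedastic(x, beta):
    # Ordinary probit: the latent error has standard deviation 1 for everyone.
    return normal_cdf(beta * x)

def prob_heteroskedastic(x, z, beta, gamma):
    # Assumed variance function: the latent error has standard deviation
    # exp(gamma * z), so the linear index is rescaled by that amount.
    return normal_cdf(beta * x / math.exp(gamma * z))

# Hypothetical parameter values, for illustration only.
beta, gamma = 1.0, 0.5
for z in (0.0, 1.0, 2.0):
    p_hom = prob_homoskedastic(1.0, beta)
    p_het = prob_heteroskedastic(1.0, z, beta, gamma)
    print(f"z={z:.0f}  homoskedastic={p_hom:.3f}  heteroskedastic={p_het:.3f}")
```

With gamma = 0 the two functions coincide; as exp(gamma * z) grows, the same x maps to a probability closer to 0.5.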

      p_c returned by hetprobit is the "p-value for heteroskedasticity LR test". As a general rule, when interpreting p-values you should start by finding out the null hypothesis. (Actually, the right order is that you should first formulate the null hypothesis you are interested in, and then find an appropriate test for that hypothesis.) Here the null hypothesis would be that the variables in the heteroscedasticity part of the model have, in their current functional form, no effect on the residual variance. A large p-value indicates that you could not reject that hypothesis. That could mean that there is really no heteroscedasticity in your model, or that you entered the variables with the wrong functional form, or that your data contained insufficient information to detect heteroscedasticity that really exists. The latter is most likely. hetprobit assumes that the S-shaped relationship between the probability and the explanatory variables assumed by the probit is strictly true, and small deviations from that S-shape are used to identify the residual variance. You can imagine how little information your data contains for such a parameter, and how fragile such a model is.
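
As a rough illustration of what a likelihood-ratio p-value of this kind is, the sketch below (Python, standard library only) computes the LR statistic and its p-value for the special case of a single variable in the variance equation, i.e. one degree of freedom. The function name and the two log-likelihood values are hypothetical, not output from any real model. Note that a p-value always lies between 0 and 1.

```python
import math

def lr_test_pvalue_df1(ll_restricted, ll_unrestricted):
    """Return (LR statistic, p-value) for a test with one restriction."""
    # LR statistic: twice the gain in log-likelihood from the extra parameter.
    lr = max(2.0 * (ll_unrestricted - ll_restricted), 0.0)
    # For a chi-square with 1 df, P(chi2 > lr) = erfc(sqrt(lr / 2)).
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value

# Hypothetical log-likelihoods: ordinary probit vs. heteroskedastic probit.
ll_probit = -120.4
ll_hetprobit = -119.9
lr, p_value = lr_test_pvalue_df1(ll_probit, ll_hetprobit)
print(f"LR chi2(1) = {lr:.2f}, p-value = {p_value:.3f}")
```

A large p-value here only says that the restricted model could not be rejected; as noted above, it does not show that heteroscedasticity is absent.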
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------
