
  • how to interpret Probit coefficients

    Today I had a discussion with my professor about how to interpret the coefficients of a probit analysis.

    He wondered whether they are different from or similar to the coefficients of any other type of regression.

    Would like to hear your thoughts on this.

  • #2
    Probit coefficients are rather in a class by themselves, and their meaning is difficult to put into words. The probit model is perhaps best thought of as modeling a latent outcome y* = b0 + b1x1 + b2x2 + ... + bnxn + error, where the error term has a standard normal distribution, and the observed outcome y is 1 if y* > 0 and 0 otherwise. The coefficients then are marginal effects of the x's on this latent outcome y*.
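The latent-outcome view can be checked with a small simulation. This is only a sketch using the Python standard library; the coefficients `B0` and `B1` and the function names are made up for illustration. It draws y* = b0 + b1*x + error with standard normal errors, records how often y* > 0, and compares that share with Phi(b0 + b1*x).

```python
import random
from statistics import NormalDist

# Hypothetical coefficients, chosen only for illustration.
B0, B1 = -0.5, 0.8

def simulate_share_of_ones(x, n=100_000, seed=0):
    """Draw y* = B0 + B1*x + e with e ~ N(0, 1); return the share with y* > 0."""
    rng = random.Random(seed)
    ones = sum(1 for _ in range(n) if B0 + B1 * x + rng.gauss(0.0, 1.0) > 0)
    return ones / n

def probit_probability(x):
    """P(y = 1 | x) implied by the latent model: Phi(B0 + B1*x)."""
    return NormalDist().cdf(B0 + B1 * x)

for x in (0.0, 1.0, 2.0):
    print(x, round(simulate_share_of_ones(x), 3), round(probit_probability(x), 3))
```

The simulated share and the normal CDF value should agree to within sampling noise, which is why the coefficients are effects on y* (measured in standard deviations of the error) rather than directly on the probability.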

    In some respects, probit models are similar to logistic models. The latter can also be thought of as models of a latent outcome y* = b0 + b1x1 + b2x2 + ... + bnxn + error. But this time the error distribution is the standard logistic distribution instead of the standard normal. The logistic coefficients, however, are more easily understood as log odds ratios than as marginal effects on y*. Unfortunately, as far as I am aware, there is nothing analogous to log odds ratios for the interpretation of probit coefficients.
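To see why the log odds ratio reading works for logistic but not for probit coefficients, here is a minimal sketch (standard library only; `B0` and `B1` are hypothetical values, and the same pair is plugged into both models purely for comparison). It computes the change in log odds for a one-unit increase in x at several starting points.

```python
import math
from statistics import NormalDist

# Hypothetical coefficients, used for both models only to compare behavior.
B0, B1 = -0.5, 0.8

def log_odds(p):
    return math.log(p / (1 - p))

def logistic_prob(x):
    """P(y = 1 | x) under a logistic model."""
    return 1 / (1 + math.exp(-(B0 + B1 * x)))

def probit_prob(x):
    """P(y = 1 | x) under a probit model."""
    return NormalDist().cdf(B0 + B1 * x)

def log_odds_step(prob, x):
    """Change in log odds when x increases from x to x + 1."""
    return log_odds(prob(x + 1)) - log_odds(prob(x))

for x in (-1.0, 0.0, 1.0, 2.0):
    print(x,
          round(log_odds_step(logistic_prob, x), 4),
          round(log_odds_step(probit_prob, x), 4))
```

Under the logistic model the step equals B1 at every x, so the coefficient is a log odds ratio. Under the probit model the step depends on where x starts, so no single odds ratio summarizes the coefficient.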

    Now, probing more deeply into the similarity of the logistic and probit models, it is worth noting that the normal and logistic distributions have very similar shapes: when similarly scaled, their graphs look quite similar to the eye, and one has to get far into the tails to find much difference between them. The variance of the standard logistic distribution is (pi^2)/3, whereas that of the standard normal distribution is 1. Given this, it is commonly noted that probit and logistic regressions of the same data tend to lead to the same substantive conclusions. (And, in fact, the preference for one over the other is usually based on the historical habits of a discipline rather than anything statistical or substantive.) Another consequence of the similarity (up to a scale factor) of these distributions is that the logistic regression coefficient will usually be approximately equal to the probit regression coefficient * pi/sqrt(3) [which is approximately 1.81].
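The scale factor and the closeness of the two shapes can be checked numerically. The sketch below (standard library only; the grid from -4 to 4 is an arbitrary choice) rescales the standard logistic distribution to unit variance and measures the largest gap between its CDF and the standard normal CDF.

```python
import math
from statistics import NormalDist

# Standard deviation of the standard logistic distribution: sqrt(pi^2 / 3).
SCALE = math.pi / math.sqrt(3)

def normal_cdf(z):
    return NormalDist().cdf(z)

def rescaled_logistic_cdf(z):
    """CDF of a logistic distribution rescaled to have variance 1."""
    return 1 / (1 + math.exp(-SCALE * z))

# Arbitrary evaluation grid: -4.00, -3.99, ..., 4.00.
grid = [i / 100 for i in range(-400, 401)]
max_gap = max(abs(normal_cdf(z) - rescaled_logistic_cdf(z)) for z in grid)

print("scale factor:", round(SCALE, 4))
print("largest CDF gap on the grid:", round(max_gap, 4))
```

The gap stays at a few percentage points of probability across the grid, which is consistent with the two models usually telling the same substantive story.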

    For those who have grown comfortable working with logistic regression models and (log) odds ratios, it is sometimes helpful, when first approaching probit regression output, to mentally multiply the coefficients by about 1.8 to get a "ballpark estimate" of what the corresponding logistic regression results would probably be.
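As a rough illustration of that mental conversion, the sketch below turns a probit coefficient into an approximate logistic coefficient and odds ratio. The helper names and the example coefficient of 0.5 are hypothetical, and the result is only a rule-of-thumb approximation, not what an actual logistic fit would return.

```python
import math

# Rule-of-thumb scale factor between probit and logistic coefficients.
SCALE = math.pi / math.sqrt(3)

def ballpark_logit_coefficient(probit_coef):
    """Approximate logistic coefficient implied by a probit coefficient."""
    return probit_coef * SCALE

def ballpark_odds_ratio(probit_coef):
    """Approximate odds ratio: exponentiate the ballpark logistic coefficient."""
    return math.exp(ballpark_logit_coefficient(probit_coef))

probit_coef = 0.5  # hypothetical value read off a probit output table
print(round(ballpark_logit_coefficient(probit_coef), 3))
print(round(ballpark_odds_ratio(probit_coef), 2))
```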

    Hope this helps.


    • #3
      That does help, very interesting!
      One last question: how do I denote the probit coefficient in a formula?
      Currently I have the following formula, but several papers use Y instead of Zy, and some write it as E(Zy).

      Zy = α + 𝛽1𝑋1 + 𝛽1𝑋12 + 𝛽1𝐶1 + 𝛽2𝐶2 + 𝛽2𝐶2 + 𝛽3𝐶3i


      • #4
        Let Yi be the dependent 0/1 variable, and let Xi be the vector of explanatory variables. The probit coefficients appear in a formula as

        Prob(Yi=1) = F(Xi'b) = F(X1*b1 + X2*b2 + ... + Xk*bk)

        where F(.) is the standard normal cumulative distribution function, customarily denoted by the Greek capital letter Phi, which you can get in LaTeX with \Phi.
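For a paper, the same formula could be typeset roughly as follows. This is one possible notation, not the only accepted one: the number of regressors k, the subscript placement, and the error symbol \varepsilon_i are assumptions added here, and the second line simply restates the latent-outcome form from #2.

```latex
\Pr(Y_i = 1 \mid X_i) = \Phi(X_i' b)
  = \Phi\left(b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki}\right),
\qquad
\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \, e^{-t^2/2} \, dt

% Equivalent latent-outcome form
Y_i^* = X_i' b + \varepsilon_i, \quad \varepsilon_i \sim N(0, 1), \qquad
Y_i = \begin{cases} 1 & \text{if } Y_i^* > 0 \\ 0 & \text{otherwise} \end{cases}
```

Writing the left-hand side as a probability (or as the latent Y_i^*) avoids suggesting that the observed 0/1 outcome itself is a linear function of the regressors.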