  • Calculation of predicted probabilities

    When calculating the predicted probabilities in a logistic regression model, do we consider all the variables or just the significant ones?
    For example: let's say my model has a dependent variable Y and 3 independent variables Xi, of which the coefficients of X1 and X2 are significant whereas that of X3 is not. When calculating the predicted probabilities, will I use just X1*beta1 + X2*beta2, or include X3*beta3 as well?

  • #2
    Stata takes you seriously. So if you ask Stata to fit a model that includes x1, x2, and x3, then it will fit a model that contains those variables. The predicted probabilities are just a representation of that model, so they too include all those variables.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
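To make the point in #2 concrete, here is a minimal sketch in plain Python of how a predicted probability is formed from a fitted logit model: the inverse logit of the full linear predictor, intercept included. The coefficient and covariate values are made up for illustration and the function name is hypothetical; in Stata, `predict` after `logit` reports this kind of quantity from the fitted model.

```python
import math

def predicted_probability(intercept, betas, xs):
    """Inverse logit of the full linear predictor: every fitted coefficient enters."""
    linear_predictor = intercept + sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical fitted coefficients (invented for illustration, not from a real model).
intercept = -1.0
betas = [0.8, -0.5, 0.3]   # beta1, beta2, beta3; suppose beta3 is "not significant"
xs = [1.0, 2.0, 1.5]       # one observation's values of X1, X2, X3

# What the fitted model implies: all three terms are used.
p_model = predicted_probability(intercept, betas, xs)

# What you get if beta3 is replaced by 0 by hand: a different number,
# which no longer corresponds to the model that was actually fitted.
p_zeroed = predicted_probability(intercept, [0.8, -0.5, 0.0], xs)

print("model prediction:   ", round(p_model, 4))
print("beta3 zeroed by hand:", round(p_zeroed, 4))
```

The two numbers differ, which is the practical content of the question: dropping the X3*beta3 term changes the prediction even though beta3 is not significant.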


    • #3
      Thanks a lot Maarten! That was very helpful.

      From what I understand, the logic behind that is that the variables included in the model should be important and hence are included in the predicted probability calculation. If they are not important, they should not be there in the model.

      However, how can we simultaneously say that a variable does not have a statistically significant effect on the dependent variable and include it in the calculation of predicted probabilities? That feels slightly contradictory to me.


      • #4
        Originally posted by Maarten Buis:
        "Stata takes you seriously. So if you ask Stata to fit a model that includes x1, x2, and x3, then it will fit a model that contains those variables."

        Do other software packages do the same? Are you aware of any that differ?


        • #5
          If there's any software that differs, it is not worth serious attention. In fact, it should be avoided like the proverbial plague.

          The key point is simple. If you fit a model and then ask for predictions, Stata uses the model you just fitted. Replacing the coefficients of predictors that weren't significant with zero would be contrary to fact, unless by a quite extraordinary coincidence those coefficients were exactly zero. In any case, how would software know your cut-off?

          What you could do is refit your model with just the predictors that satisfy your predilections, but watch out: many researchers regard that as cherry-picking and in particular cases it could easily conflict with other desiderata, such as quantifying effects to the extent possible, keeping predictors together that belong together, consistency with previous work, etc. In fact, it is not even guaranteed that those predictors will remain significant.

          Punar: We prefer full real names here. Although people in some cultures have just one name, most cultures work with given names and family names, and we ask that you follow suit.


          • #6
            If you are serious, you decide - before knowing the result - which model you want to investigate. Stata can - with the stepwise: prefix - include or eliminate predictors based upon significance, but don't do that. See why at:

            http://www.stata.com/support/faqs/st...sion-problems/


            • #7
              Think of it this way: Statistical significance has little to do with importance. It is much more a measure of accuracy.

              Best
              Daniel


              • #8
                If an effect is statistically insignificant, you can't rule out that it is 0. On the other hand, you also can't rule out that the effect is actually larger than what was estimated. If, say, the estimated coefficient was 10, and the confidence interval ran from -10 to 30, it would make about as much sense to treat the value as 20 as it does to treat it as 0. And nobody would seriously consider doing that!
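The symmetry in this example can be checked with a few lines of arithmetic. The sketch below assumes the interval quoted above is a symmetric 95% interval (estimate plus or minus 1.96 standard errors) and uses a Wald z-test; both are assumptions added for illustration, and the function name is hypothetical.

```python
import math

def two_sided_p(estimate, std_error, null_value):
    """Two-sided p-value of a Wald z-test of H0: coefficient == null_value."""
    z = (estimate - null_value) / std_error
    # Standard normal CDF evaluated at |z|, via the error function.
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - cdf)

estimate = 10.0
ci_low, ci_high = -10.0, 30.0

# Assumption: symmetric 95% interval, so the half-width equals 1.96 * SE.
std_error = (ci_high - ci_low) / (2 * 1.96)

p_zero = two_sided_p(estimate, std_error, 0.0)     # test against 0
p_twenty = two_sided_p(estimate, std_error, 20.0)  # test against 20

print("p-value for H0: beta = 0 :", round(p_zero, 3))
print("p-value for H0: beta = 20:", round(p_twenty, 3))
```

Under these assumptions the two p-values are identical, because 0 and 20 lie equally far from the estimate of 10: the data are exactly as compatible with a coefficient of 20 as with a coefficient of 0.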

                Even if some effects don't differ significantly from 0, the estimated coefficients of the other variables can be affected by the inclusion of those variables in the model.

                In short, if you think the effect of a variable should be treated as 0, then drop it from the model and re-estimate the remaining coefficients. Don't just reset it to 0 yourself while leaving all the other coefficients as is.
                -------------------------------------------
                Richard Williams, Notre Dame Dept of Sociology
                StataNow Version: 19.5 MP (2 processor)

                EMAIL: [email protected]
                WWW: https://www3.nd.edu/~rwilliam
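The difference between dropping a variable and re-estimating, versus resetting its coefficient to 0 by hand, can be seen in a small simulation. This is only a sketch under stated assumptions: the data are simulated with made-up true coefficients, x3 is deliberately correlated with x1, and the model is fitted with a hand-written Newton-Raphson routine in plain Python rather than with Stata's `logit`. All names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_logit(X, y, iterations=25):
    """Maximum likelihood by Newton-Raphson; each row of X starts with a 1 (intercept)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k                     # score vector X'(y - p)
        info = [[0.0] * k for _ in range(k)] # information matrix X'WX
        for row, yi in zip(X, y):
            p = sigmoid(sum(b * x for b, x in zip(beta, row)))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * row[a]
                for c in range(k):
                    info[a][c] += w * row[a] * row[c]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Simulated data (assumed true values: intercept -0.5, beta1 1.0, beta3 0.3).
random.seed(12345)
rows = []
for _ in range(500):
    x1 = random.gauss(0, 1)
    x3 = 0.8 * x1 + 0.6 * random.gauss(0, 1)   # x3 is correlated with x1
    p = sigmoid(-0.5 + 1.0 * x1 + 0.3 * x3)
    rows.append((x1, x3, 1 if random.random() < p else 0))

y = [r[2] for r in rows]
X_full = [[1.0, r[0], r[1]] for r in rows]
X_reduced = [[1.0, r[0]] for r in rows]

full = fit_logit(X_full, y)          # intercept, beta1, beta3
reduced = fit_logit(X_reduced, y)    # intercept, beta1 after dropping x3 and refitting
hand_zeroed = [full[0], full[1], 0.0]  # beta3 reset to 0, others left as is

print("full model         :", [round(b, 3) for b in full])
print("refit without x3   :", [round(b, 3) for b in reduced])
print("beta3 zeroed by hand:", [round(b, 3) for b in hand_zeroed])
```

The refitted coefficient on x1 will generally differ from the one in the full model, because x1 absorbs part of what x3 was capturing. The hand-zeroed vector matches neither fitted model, which is why it is not a sensible basis for predictions.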
