  • Ordered Logit Model - % Correct Predicted

    Dear Statalisters,

    I am using an ordinal logistic regression model with a 4-category dependent variable. Can you PLEASE help me find the % correctly predicted (% correctly classified) for my sample?

    Best Regards,

    Michelle
    Master Student in Corporate Finance, Lund University

  • #2
    I assume you are using -ologit-. If you look at the help for predict (under ologit postestimation), you will see that the basic predict gives you 4 new variables, one predicted probability per category. You then have to decide which category each person should be predicted to be in, assign them to that category, and compare the predicted categories with the actual ones. This is not straightforward, and I question your goal here.

    • #3
      The question that might be more straightforward is, how do I get epcp, lstat, estat classification or something in the same category to work on an ologit regression?

      • #4
        This is a little klutzy but you could probably build off of something like this:

        Code:
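        use http://www3.nd.edu/~rwilliam/statafiles/ordwarm2, clear
        ologit warm yr89 male white age ed prst
        * One predicted probability per outcome category
        predict p1 p2 p3 p4
        * Largest of the four predicted probabilities for each case
        egen maxpred = rowmax(p1-p4)
        gen vpred = .
        * Assign the category with the highest probability; skip cases with no prediction
        forval i = 1/4 {
            replace vpred = `i' if p`i' == maxpred & !missing(maxpred)
        }
        tab2 warm vpred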
        I am assuming here that you want the predicted category to be the category with the highest predicted probability. With 4 categories, that value could be as low as, say, 25.1%. If there are ties, then with my approach the highest-numbered category wins.

        Personally, I am not that crazy about classification tables. Especially if one category has a relatively large N, you might find that it has the highest predicted value no matter what, e.g. if 60% of the cases are in category 3 then all cases may be predicted as falling into category 3.
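
        To turn that cross-tab into a single "% correct" figure, and to see how it compares with always guessing the largest category, something along these lines might work. It is only a sketch: it assumes warm is coded 1 to 4 (as the code above also does) and keeps the highest-probability rule. The names correct, freq and modal_share are made up for illustration.

        ```stata
        * Sketch: overall % correctly classified after ologit (highest-probability rule)
        use http://www3.nd.edu/~rwilliam/statafiles/ordwarm2, clear
        ologit warm yr89 male white age ed prst
        predict p1 p2 p3 p4 if e(sample)
        egen maxpred = rowmax(p1-p4)
        gen byte vpred = .
        forvalues i = 1/4 {
            replace vpred = `i' if p`i' == maxpred & !missing(maxpred)
        }

        * 1 if the predicted category equals the observed one, 0 otherwise
        gen byte correct = (warm == vpred) if !missing(vpred)
        quietly summarize correct
        display "Percent correctly classified: " %5.1f 100*r(mean)

        * Baseline: share of cases in the largest observed category
        quietly tabulate warm if e(sample), matcell(freq)
        mata: st_numscalar("modal_share", max(st_matrix("freq")) / sum(st_matrix("freq")))
        display "Modal-category baseline:      " %5.1f 100*scalar(modal_share)
        ```

        If the first number is barely above the second, the model is classifying little better than a rule that puts everyone in the largest category.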
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 19.5 MP (2 processor)

        WWW: https://www3.nd.edu/~rwilliam

        • #5
          Dear Richard,

          Thank you so much for your help. I really appreciate it.

          Now I have a new problem that I hope you can help me with as well.

          I ran the Brant test, which showed a significant result, so the parallel lines assumption is violated. I then fit a gologit2 model. The BIC was lowest for the ologit model. Given this, is it still OK for me to use my ologit regression? What else can I do when the parallel lines assumption is violated?

          Best,
          Michelle

          • #6
            And what are the consequences of using ologit when the parallel lines assumption does not hold? Does it only affect the interpretation of the coefficients? Or does it also affect the "% correct predicted" in some sense?

            Best

            • #7
              I guess Richard could explain all this to you better than anyone else, but I will try.
              1) The Brant test points toward gologit2, but the BIC prefers ologit. That can happen when the parallel lines (PL) assumption is violated but the size of the deviation is small: the extra parameters are penalized by the BIC while adding little to the explanatory power of your model.
              This mostly happens with large samples. How large is yours? In very large samples the Brant test very often rejects, even when relaxing the assumption would not change the results much. Did ologit and gologit2 yield similar results? In the gologit2 output, are the coefficients for each variable close to each other across equations?
              The gamma option in gologit2 shows the extent to which each variable violates the proportional odds assumption. That would be my first clue.
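
              One way to put the two fits side by side is sketched below. It uses the example data from #4 rather than Michelle's data, assumes gologit2 has been installed from SSC, and the stored names m_ologit and m_gologit are made up for illustration.

              ```stata
              * Assumes the user-written command is installed: ssc install gologit2
              use http://www3.nd.edu/~rwilliam/statafiles/ordwarm2, clear

              * Parallel-lines model
              ologit warm yr89 male white age ed prst
              estimates store m_ologit

              * Fully unconstrained generalized ordered logit
              gologit2 warm yr89 male white age ed prst
              estimates store m_gologit

              * Log likelihood, df, AIC and BIC for both models; lower BIC is preferred
              estimates stats m_ologit m_gologit
              ```

              The BIC column is the comparison discussed here: the unconstrained model has a higher log likelihood, but it pays a penalty for every extra coefficient.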

              2) Concerning the consequences: it does not only affect the interpretation of the coefficients. It might affect their significance, or even their sign, and yes, it could affect the % correctly predicted, because the two models do not yield the same estimates or the same predictions.
              Assuming parallel lines when the true pattern is closer to a U shape, for instance, might blur the real effects.
              However, if the coefficients all point in the same direction with similar (though not identical) magnitudes, the % correctly predicted won't change much.

              I hope this helps. You may want to wait for Richard's comments on the differences between ologit and gologit2 or, in the meantime, read what he has written about this.

              • #8
                Thank you Charlie!
                I have 2 samples, one with 350 obs and the other with 250 obs. Can those be considered large samples?

                Best

                • #9
                  Michelle, unfortunately, this is not what I had in mind when talking about large samples.
                  Could you report the result of the Brant test (especially its significance level)?

                  More importantly: what does the gamma option tell you? If the gologit2 coefficients are not significantly different from the ologit ones, that would justify setting the gologit2 model aside.
                  Also, at a glance, could you tell us whether the coefficients for a given variable are close to each other across the gologit2 equations? And close to the ologit coefficient?

                  Finally, though maybe this should have come first: could you specify how you ran gologit2? Are all the variables allowed to be non-parallel (NPL), or only some? Could you post the exact command you ran? If you switched only one variable to NPL and kept the parallel lines assumption for the others, you may have chosen the wrong one (or too few) for gologit2 to fit better than ologit.
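
                  For reference, the kinds of commands I am asking about might look like the sketch below. It uses the example data from #4, not your data, and assumes that brant (from SPost) and gologit2 (from SSC) are installed. The variables listed in npl() are an arbitrary illustration, not a recommendation.

                  ```stata
                  use http://www3.nd.edu/~rwilliam/statafiles/ordwarm2, clear

                  * Brant test, run right after ologit; detail also lists the underlying binary logits
                  ologit warm yr89 male white age ed prst
                  brant, detail

                  * Let gologit2 decide which variables get non-parallel lines;
                  * gamma reports each variable's deviations from proportionality
                  gologit2 warm yr89 male white age ed prst, autofit gamma

                  * Or free selected variables by hand and keep the rest parallel
                  gologit2 warm yr89 male white age ed prst, npl(yr89 male)
                  ```

                  With autofit, only the variables that fail the parallel lines test are freed, so the model stays close to ologit wherever the assumption looks reasonable.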

                  Anyway, if gologit2 shows neither significantly different results nor a better fit, you could also report the ologit model and describe the gologit2 model in an appendix, just to show that you have considered the parallel lines assumption.


