  • High value of LR Chi2 test of an ordered probit model with a large number of observations

    Is it good to obtain a high value of the LR chi2 test statistic after running an ordered probit regression on a large number of observations?
    I ran the ordered probit regression, and I got the following results:
    Number of obs = 2,135

    LR chi2(19) = 1899.87

    Prob > chi2 = 0.0000

    Pseudo R2 = 0.2183

    Log likelihood = -3401.9889

    Additionally, each of the independent variables' level of significance was high.


  • #2
    Your dataset is too big to post here, but I can't see any problem if you were to copy and paste the displayed results. Otherwise I am not clear what is puzzling you: do you think those results are contradictory?

    When you say that the independent variables have high levels of significance, do you mean that displayed P values are close to zero?



    • #3
      Shlair:
      welcome to this forum.
      Like Nick, I do not understand what issue you're complaining about.
      As per FAQ, please post what you typed and what Stata gave you back. Thanks.
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Many thanks, Carlo Lazzaro and Nick Cox, for your replies.
        In response to your question, Mr. Nick Cox: yes, the majority of the independent variables displayed P values close to zero.
        In response to your question, Mr. Carlo Lazzaro: I am not complaining; I am just wondering whether having a high LR chi2 is considered a good result or not.

        I will report the results here:


        Iteration 0: log likelihood = -4351.9252
        Iteration 1: log likelihood = -3418.0734
        Iteration 2: log likelihood =  -3402.043
        Iteration 3: log likelihood = -3401.9889
        Iteration 4: log likelihood = -3401.9889

        Ordered probit regression                                  Number of obs =    2,135
                                                                   LR chi2(19)   =  1899.87
                                                                   Prob > chi2   =   0.0000
        Log likelihood = -3401.9889                                Pseudo R2     =   0.2183

        -----------------------------------------------------------------------------------
                     Q158 | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        ------------------+----------------------------------------------------------------
                       Q6 |  -.0410939   .0363945    -1.13   0.259    -.1124259    .0302381
                      Q94 |   .0731771   .0687433     1.06   0.287    -.0615573    .2079114
                     Q159 |   .4394286    .014358    30.61   0.000     .4112875    .4675698
                     Q160 |  -.0326823   .0118063    -2.77   0.006    -.0558222   -.0095425
                     Q161 |   .0021466   .0124849     0.17   0.863    -.0223232    .0266165
                     Q162 |  -.0432835    .011404    -3.80   0.000     -.065635   -.0209321
                     Q163 |   .1759674    .013153    13.38   0.000     .1501881    .2017467
                     Q164 |   .0597167   .0127242     4.69   0.000     .0347777    .0846557
                     Q172 |   .0002827    .013051     0.02   0.983    -.0252968    .0258622
                SECVALWGT |   .2839762   .6329277     0.45   0.654    -.9565393    1.524492
                          |
               G_TOWNSIZE |
              2,000-5,000 |  -.8526724   .3248863    -2.62   0.009    -1.489438   -.2159069
             5,000-10,000 |  -.3020151    .176061    -1.72   0.086    -.6470884    .0430581
            10,000-20,000 |  -.0205398   .2187824    -0.09   0.925    -.4493455    .4082659
            20,000-50,000 |  -.3022471   .1488347    -2.03   0.042    -.5939577   -.0105365
           50,000-100,000 |  -.3864841   .1503259    -2.57   0.010    -.6811175   -.0918507
          100,000-500,000 |  -.2445002   .1353398    -1.81   0.071    -.5097614     .020761
         500,000 and more |  -.3967275   .1528847    -2.59   0.009     -.696376   -.0970791
                          |
               H_URBRURAL |   .3069201   .1923847     1.60   0.111     -.070147    .6839871
             H_SETTLEMENT |  -.1353574   .0873805    -1.55   0.121    -.3066201    .0359052
        ------------------+----------------------------------------------------------------
                    /cut1 |   1.240857   .6625347                     -.0576867    2.539401
                    /cut2 |    1.54598   .6606799                      .2510707    2.840888
                    /cut3 |   2.115808   .6595719                      .8230706    3.408545
                    /cut4 |   2.607556   .6601363                      1.313712    3.901399
                    /cut5 |   3.290127   .6618758                      1.992875     4.58738
                    /cut6 |   3.890282   .6633755                       2.59009    5.190474
                    /cut7 |   4.518814   .6645098                      3.216399     5.82123
                    /cut8 |   5.195655    .665834                      3.890644    6.500666
                    /cut9 |   5.732438   .6665422                      4.426039    7.038836
        -----------------------------------------------------------------------------------



        • #5
          As I understand it, the statistical significance of the model is really not in doubt. The P-value for the model is reported as 0.0000 (meaning < 0.00005), but it is in fact much, much smaller. The key questions are whether you need all those predictors or could produce a better model in other ways.

          That seems more of a (social) science issue than a statistical one.
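To put a rough number on "much, much smaller": the sketch below computes a Chernoff upper bound on the tail probability of a chi-squared variable with 19 degrees of freedom at 1899.87, working on the log scale. It is an illustration in Python rather than anything Stata reports, and the function name `chi2_tail_log10_upper_bound` is made up for this example. Because it is an upper bound, the true P-value is smaller still.

```python
import math


def chi2_tail_log10_upper_bound(x: float, df: int) -> float:
    """log10 of the Chernoff upper bound on P(chi2_df >= x), for x > df.

    Bound: P(X >= x) <= (x/df)**(df/2) * exp(-(x - df)/2).
    Everything is done in logs so that nothing underflows.
    """
    if x <= df:
        raise ValueError("the bound is only informative for x > df")
    log_bound = 0.5 * df * math.log(x / df) - 0.5 * (x - df)
    return log_bound / math.log(10.0)


# LR chi2(19) = 1899.87, copied from the posted output.
log10_p = chi2_tail_log10_upper_bound(1899.87, 19)
print(f"P-value < 10^{log10_p:.0f}")
```

The bound comes out near 10^-389, which is below the smallest positive double-precision number (about 5e-324), so a direct double-precision calculation of this tail probability would simply return 0.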



          • #6
            I see. Thanks indeed for your reply, Mr. Nick Cox. Do you think I should try to produce a better model? I am looking forward to your kind recommendations.



            • #7
              And what are the repercussions of having a P value much smaller than 0.05?



              • #8
                Neither what you write nor the posted results are clear; please read the FAQ and follow its advice so that what you post becomes easier to read. If I understand you correctly, you are wondering about the import of the overall test of the model. This is not an important test: it only asks whether anything in the model matters. The null hypothesis is that nothing in the model matters, so a small p-value (i.e., < 0.05) is desirable. However, this p-value should not be used for comparing models.
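As a check on what the overall test measures, the header statistics can be reproduced from the two log likelihoods in the posted output. This is a minimal Python sketch, assuming that iteration 0 is the fit with cutpoints only (no predictors) and that the reported Pseudo R2 is McFadden's; the arithmetic is consistent with both assumptions. The function names are illustrative.

```python
# Log likelihoods copied from the posted iteration log.
ll_null = -4351.9252  # iteration 0: assumed to be the cutpoints-only model
ll_full = -3401.9889  # final iteration: model with all 19 predictor terms


def lr_chi2(ll_null: float, ll_full: float) -> float:
    """Likelihood-ratio statistic: twice the gain in log likelihood."""
    return 2.0 * (ll_full - ll_null)


def mcfadden_r2(ll_null: float, ll_full: float) -> float:
    """McFadden's pseudo R2: one minus the ratio of log likelihoods."""
    return 1.0 - ll_full / ll_null


print(f"LR chi2(19) = {lr_chi2(ll_null, ll_full):.2f}")
print(f"Pseudo R2   = {mcfadden_r2(ll_null, ll_full):.4f}")
```

Both values agree with the reported 1899.87 and 0.2183. The statistic only compares the fitted model with the model that has no predictors at all, which is why it says little about whether one set of predictors is better than another.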



                • #9
                  Thanks for your reply, Rich Goldstein.



                  • #10
                    Originally posted by Shlair Alzanganee:
                    "And what are the repercussions of having a P value much smaller than 0.05?"

                    As always, the issue is not statistical significance but specifying a regression model that gives a true and fair view of the data-generating process you're investigating.

                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      Thanks for your reply, Mr. Lazzaro.



                      • #12
                        Some fields live and die by P < 0.05. But in any field where variables might have some effect even if small, leaving the variable in the model gives it scope to do its best and shows you haven't ignored it, even if its P > 0.05.

                        Everyone dislikes overfitting, but even objective criteria don't give identical answers, and competent people can disagree in good faith on what is overfitting.
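One way to see how objective criteria can disagree is to compare the per-parameter penalties of AIC (2) and BIC (ln N). The Python sketch below is only a rough illustration: it uses the squared z statistic of a single coefficient as a stand-in for the drop in -2 log likelihood that the coefficient buys, which is an approximation and not a substitute for refitting the model without that term. The z values are copied from the output posted in #4; the function name `verdicts` is hypothetical.

```python
import math

N_OBS = 2135  # number of observations in the posted output

# z statistics copied from the posted table (single-coefficient terms).
z_stats = {"Q6": -1.13, "Q159": 30.61, "H_URBRURAL": 1.60}

AIC_PENALTY = 2.0              # AIC charges 2 per extra parameter
BIC_PENALTY = math.log(N_OBS)  # BIC charges ln(N) per extra parameter


def verdicts(z: float) -> tuple[bool, bool]:
    """Return (AIC keeps, BIC keeps) for one coefficient.

    Uses z**2 as a rough proxy for the reduction in -2 * log likelihood
    attributable to the coefficient (an assumption of this sketch).
    """
    gain = z * z
    return gain > AIC_PENALTY, gain > BIC_PENALTY


for name, z in z_stats.items():
    aic_keep, bic_keep = verdicts(z)
    print(f"{name:<11} z={z:6.2f}  AIC keeps: {aic_keep}  BIC keeps: {bic_keep}")
```

Under this approximation, H_URBRURAL (z = 1.60) clears the AIC penalty but not the BIC penalty, Q6 clears neither, and Q159 clears both, so the two criteria would already point to different models from the same fit.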



                        • #13
                          Shlair:
                          On the potential clash between fitting and prediction, see the -lasso- entry, "Lasso for prediction and model selection", page 158, in the Stata .pdf manual.
                          Kind regards,
                          Carlo
                          (Stata 19.0)

