
  • How to choose between a Poisson regression model and a negative binomial model?

    My dependent variable is integer-valued (a count), so I think I should use either a Poisson regression model or a negative binomial model. First, I tried the Poisson regression model; the results are as follows:
    [Image: Poisson regression output (TIM图片20180624203228.png)]

    Then I tried the negative binomial model; the results are as follows:
    [Image: negative binomial regression output (2.png)]

    As we can see, the LR test of alpha=0 is significant, so I should use the negative binomial model. However, the Pseudo R2 of the negative binomial model (0.0393) is smaller than that of the Poisson regression model (0.1254), which seems to say that the Poisson model fits better. These results look contradictory to me. Could somebody help me with this question? Which model should I use? Thank you very much!
    Last edited by HAN YUE; 24 Jun 2018, 06:48.

  • #2
    Help please!



    • #3
      Han:
      you do not say whether you ran -estat gof- after -poisson- and, if so, what Stata gave you back.
      If -estat gof- shows signs of overdispersion, the -poisson- results are unreliable (in particular, its standard errors are too small), whatever the Pseudo R2 value, and you have to switch to -nbreg- (as seems to be the case with your data).
      On top of that, an improved specification of -nbreg- may also increase the Pseudo R2 value.
      Kind regards,
      Carlo
      (Stata 18.0 SE)
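
      A minimal sketch of the check described above, assuming the variable names y and x1-x7 from the commands posted in #6. It is an illustration, not necessarily the exact sequence Carlo had in mind.

      ```stata
      * Fit the Poisson model first (variable names assumed from post #6)
      poisson y x1 x2 x3 x4 x5 x6 x7
      * Report the deviance and Pearson goodness-of-fit chi2 tests
      estat gof
      ```

      A very small p-value from -estat gof- rejects the hypothesis that the Poisson model fits the data, which is commonly a symptom of overdispersion.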



      • #4
        I don't view them as contradictory. An incorrect model with a higher R^2 is not preferable to a correct model with a lower R^2. The SEs are also higher with nbreg, as they should be. I would use nbreg.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        Stata Version: 17.0 MP (2 processor)

        WWW: https://www3.nd.edu/~rwilliam
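
        One way to see why the two Pseudo R2 values cannot be ranked against each other: the Pseudo R2 reported by these commands is McFadden's, 1 - ll(model)/ll(constant-only), and each command uses its own constant-only model as the baseline. The sketch below redoes that arithmetic with the log likelihoods reported in #6. The Poisson constant-only log likelihood is not printed there, so it is backed out from LR chi2(7) = 2*(ll(model) - ll(constant-only)).

        ```stata
        * nbreg: the constant-only log likelihood is printed in the iteration log
        display 1 - (-5684.6005)/(-5917.4483)
        * poisson: constant-only log likelihood = ll(model) - LR chi2(7)/2
        display 1 - (-12401.47)/(-12401.47 - 3555.51/2)
        ```

        The two lines should reproduce the reported values (about 0.0393 and 0.1254). The baselines differ (about -14179 for Poisson versus -5917.4 for the negative binomial), so the two ratios measure improvement over different starting points. The full-model log likelihood itself is far higher for -nbreg- (-5684.6 versus -12401.5).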



        • #5
          Carlo and Richard gave excellent advice.

          Please read the FAQ carefully.

          There are recommendations about sharing data/command/output correctly. You will find advice to avoid snapshots. You will also find tips about posting an informative message. Finally, sending a second message, this Sunday, within 8 minutes of starting a thread can be taken as a bump. Please read the FAQ about (not) bumping.

          With regard to your question, if I understood correctly, it relates to the core knowledge concerning count models, and the information is widely available in textbooks, web pages, manuals, etc. I truly believe that it can be found within a couple of minutes.

          That being said, R-squared doesn’t imply good model fit.

          You may wish to check the dispersion parameter with the -glm- command. You may also make comparisons between different negative binomial models. But, as said before, you’ll find this information easily in the material mentioned above.

          Please don’t bypass the important step of getting the basics of the theory behind the model in use; otherwise obstacles and misinterpretations, let alone misspecified models, will be found galore.
          Best regards,

          Marcos
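
          A hedged sketch of the two checks suggested above, assuming the variable names from #6. The stored-estimate names nb_mean and nb_const are hypothetical labels chosen for illustration, and comparing the two -nbreg- dispersion forms is only one possible reading of "different negative binomial models".

          ```stata
          * Poisson via glm: in the header, (1/df) Pearson and (1/df) Deviance
          * values well above 1 point to overdispersion
          glm y x1 x2 x3 x4 x5 x6 x7, family(poisson) link(log)

          * Two negative binomial variants that differ in the dispersion form
          nbreg y x1 x2 x3 x4 x5 x6 x7, dispersion(mean)
          estimates store nb_mean
          nbreg y x1 x2 x3 x4 x5 x6 x7, dispersion(constant)
          estimates store nb_const

          * Side-by-side log likelihoods, AIC and BIC for the stored models
          estimates stats nb_mean nb_const
          ```

          Lower AIC and BIC indicate the preferred model among those compared.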



          • #6
            I am sorry, I am new to regression models, so I don't know exactly how to choose a valid model. Here are the commands; the data are attached.

            . import excel "d:\data1.xlsx", sheet("Sheet1") firstrow

            . poisson y x1 x2 x3 x4 x5 x6 x7

            Iteration 0:   log likelihood = -13480.539
            Iteration 1:   log likelihood = -12439.367
            Iteration 2:   log likelihood = -12401.522
            Iteration 3:   log likelihood =  -12401.47
            Iteration 4:   log likelihood =  -12401.47

            Poisson regression                              Number of obs     =      1,760
                                                            LR chi2(7)        =    3555.51
                                                            Prob > chi2       =     0.0000
            Log likelihood =  -12401.47                     Pseudo R2         =     0.1254

            ------------------------------------------------------------------------------
                       y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                      x1 |   .0310712   .0033885     9.17   0.000     .0244299    .0377125
                      x2 |   -.000045   2.06e-06   -21.79   0.000     -.000049   -.0000409
                      x3 |   .6493568   .0343303    18.91   0.000     .5820706     .716643
                      x4 |   .0000211   4.11e-07    51.36   0.000     .0000203    .0000219
                      x5 |   1.621192   .1638862     9.89   0.000     1.299981    1.942403
                      x6 |  -.5360483   .1012103    -5.30   0.000    -.7344168   -.3376797
                      x7 |  -2.034442   .1365581   -14.90   0.000    -2.302091   -1.766793
                   _cons |   2.599774   .0553757    46.95   0.000     2.491239    2.708308
            ------------------------------------------------------------------------------


            . nbreg y x1 x2 x3 x4 x5 x6 x7

            Fitting Poisson model:

            Iteration 0: log likelihood = -13480.539
            Iteration 1: log likelihood = -12439.367
            Iteration 2: log likelihood = -12401.522
            Iteration 3: log likelihood = -12401.47
            Iteration 4: log likelihood = -12401.47

            Fitting constant-only model:

            Iteration 0: log likelihood = -5921.2343
            Iteration 1: log likelihood = -5917.4492
            Iteration 2: log likelihood = -5917.4483
            Iteration 3: log likelihood = -5917.4483

            Fitting full model:

            Iteration 0: log likelihood = -5784.0523
            Iteration 1: log likelihood = -5708.8315
            Iteration 2: log likelihood = -5684.6472
            Iteration 3: log likelihood = -5684.6005
            Iteration 4: log likelihood = -5684.6005

            Negative binomial regression                    Number of obs     =      1,760
                                                            LR chi2(7)        =     465.70
            Dispersion     = mean                           Prob > chi2       =     0.0000
            Log likelihood = -5684.6005                     Pseudo R2         =     0.0393

            ------------------------------------------------------------------------------
                       y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                      x1 |   .0261243   .0120297     2.17   0.030     .0025465    .0497022
                      x2 |  -.0000276   5.42e-06    -5.10   0.000    -.0000382    -.000017
                      x3 |   .6587957   .1114276     5.91   0.000     .4404015    .8771898
                      x4 |   .0000763   4.83e-06    15.81   0.000     .0000669    .0000858
                      x5 |   1.015735   .6312224     1.61   0.108    -.2214377    2.252909
                      x6 |   .4501755   .3480728     1.29   0.196    -.2320346    1.132386
                      x7 |   .8477843   .4359345     1.94   0.052    -.0066316      1.7022
                   _cons |   1.495091   .1813951     8.24   0.000     1.139563    1.850619
            -------------+----------------------------------------------------------------
                /lnalpha |  -.1735496   .0374455                     -.2469414   -.1001577
            -------------+----------------------------------------------------------------
                   alpha |   .8406755   .0314795                      .7811864    .9046947
            ------------------------------------------------------------------------------
            LR test of alpha=0: chibar2(01) = 1.3e+04              Prob >= chibar2 = 0.000
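
            The LR statistic on the last line can be checked by hand, since the Poisson model is the alpha=0 special case of the negative binomial model. A small sketch that uses only the two final log likelihoods reported in this post:

            ```stata
            * LR = 2 * (ll of nbreg full model - ll of Poisson model)
            display 2*(-5684.6005 - (-12401.47))
            ```

            This gives roughly 13434, which Stata displays as 1.3e+04. A statistic that large is strong evidence against alpha=0, that is, against the Poisson restriction.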


            • #7
              Here is a brief overview of count models if you need it. I am sure there are many others on the web.

              https://www3.nd.edu/~rwilliam/stats3/CountModels.pdf
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
              Stata Version: 17.0 MP (2 processor)

              WWW: https://www3.nd.edu/~rwilliam



              • #8
                Han:
                as per the FAQ, please do not post attachments but share what you typed and what Stata gave you back via CODE delimiters. Thanks.
                Please also note that, due to the risk of active contents, most listers (me too) do not download spreadsheets from potentially unsafe sources. Thanks.
                That said, the -estat gof- outcome after -poisson- (that you did not report in your last post) should have pointed you to the right regression model.
                Kind regards,
                Carlo
                (Stata 18.0 SE)
