
  • Why is there a big difference between ZIP and ZINB using the same variables?

    Hi everyone. I am running zero-inflated Poisson models because the outcome has many zeros (about 1/4 of the observations). I have run ZIP and ZINB before and found that the two models, with the same predictors, produced similar results. In the present study, however, the ZIP and ZINB models produced quite different results in the count equations (but similar results in the logistic regression part). It's weird. The results are as follows:

    ZIP model (count equation)
    ------------------------------------------------------------------------------
         outcome |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             1sc |    .103509   .0356089     2.91   0.004     .0337169    .1733011
             Rpr |  -.0314016   .0062603    -5.02   0.000    -.0436716   -.0191316
             Rex |   .0233748   .0058477     4.00   0.000     .0119134    .0348361
             Rsu |  -.0073679   .0051857    -1.42   0.155    -.0175317    .0027959
             Ppr |   .0176424   .0052335     3.37   0.001      .007385    .0278999
             Pex |   .0106725   .0048924     2.18   0.029     .0010836    .0202615
             Psu |   .0108973   .0035661     3.06   0.002      .003908    .0178866
           _cons |   1.210286   .1851025     6.54   0.000     .8474913     1.57308
    ------------------------------------------------------------------------------

    ZINB model (count equation)
    ------------------------------------------------------------------------------
         outcome |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             1sc |    .116671    .067785     1.72   0.085    -.0161852    .2495271
             Rpr |  -.0364287   .0113316    -3.21   0.001    -.0586382   -.0142193
             Rex |   .0219059   .0112583     1.95   0.052    -.0001599    .0439717
             Rsu |  -.0055789   .0094421    -0.59   0.555    -.0240852    .0129273
             Ppr |   .0168384   .0097909     1.72   0.085    -.0023514    .0360282
             Pex |   .0142822   .0094063     1.52   0.129    -.0041538    .0327183
             Psu |   .0104673   .0067804     1.54   0.123     -.002822    .0237566
           _cons |   1.131097   .3383747     3.34   0.001     .4678948    1.794299
    ------------------------------------------------------------------------------

    I would appreciate it if anyone could help.

  • #2
    You have been duped by focusing on p-values and statistical significance! If you look at the coefficients themselves you will see that in every instance they are very similar. The negative binomial model assumes an error distribution with (potentially much) greater variance than the Poisson, so it offers less precise estimates of your coefficients. (Essentially, because the negative binomial error distribution tolerates a wider range of residuals than the Poisson, a wider range of coefficient estimates is compatible with the negative binomial model.) Even though the coefficient estimates are, for practical purposes, the same in both models, the standard errors, reflecting the variance in the error distribution, are larger with the negative binomial, so your p-values have increased as well. But in fact, both models are telling you pretty much the same things: the coefficients are nearly the same, and even the confidence intervals haven't changed by very much, even though some that just barely excluded zero before now just barely include it.
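
    For a concrete look at the paragraph above, here is a minimal sketch in Stata that uses only numbers copied from the two tables in #1 (the choice of rows is arbitrary). It computes how much larger the ZINB standard errors are than the ZIP ones, and rebuilds the ZINB 95% interval for 1sc, assuming the usual normal-based interval that Stata reports by default.

    ```stata
    * Ratio of ZINB to ZIP standard errors (numbers copied from #1)
    display .067785/.0356089      // 1sc
    display .0113316/.0062603     // Rpr

    * Rebuild the ZINB 95% interval for 1sc: coef +/- z(.975)*SE
    display .116671 - invnormal(.975)*.067785
    display .116671 + invnormal(.975)*.067785
    ```

    The two ratios come out at roughly 1.9 and 1.8, and the rebuilt interval agrees with the posted one (about -.016 to .250). The coefficient itself barely moved (.104 versus .117), so the change in the p-value is driven by the standard error.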

    The confusion you are facing is one of the many reasons it is time for people to abandon statistical significance, as recommended by the American Statistical Association. See https://www.tandfonline.com/doi/full...5.2019.1583913 for the "executive summary" and https://www.tandfonline.com/toc/utas20/73/sup1 for all 43 supporting articles. Or https://www.nature.com/articles/d41586-019-00857-9 for the tl;dr.


    • #3
      Dear wendy chang zhou,

      I agree with Clyde's comment that the results are not that different. However, I wonder whether you really need a zero-inflated model; a high percentage of zeros does not by itself mean that you have zero inflation. For example, a sample from a Poisson distribution with a mean of 0.01 will have about 99% zeros, and there is no zero inflation at all.
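
      The arithmetic behind that example is easy to check. The sketch below uses only built-in Stata functions; the second part, which asks what Poisson mean would give the 25% zeros mentioned in #1, is an added illustration and says nothing about whether the actual data are Poisson.

      ```stata
      * P(Y = 0) for a Poisson with mean 0.01 is exp(-0.01)
      display exp(-0.01)
      * Same quantity from the built-in Poisson probability function
      display poissonp(0.01, 0)

      * Poisson mean that would produce 25% zeros with no inflation at all
      display -ln(.25)
      ```

      The first two lines both give about .990, and the last gives a mean of about 1.39, so a plain Poisson with a modest mean can already produce a quarter of zeros.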

      Best wishes,

      Joao


      • #4
        Welcome to Statalist.

        Your output is very hard to read. You should use code tags instead. See point 12 of the Statalist FAQ on asking questions effectively.

        I agree with Clyde. The coefficients are not that different. If you use Poisson when you should be using nbreg, the standard errors and significance tests will tend to be too optimistic, just as they are in your results. For more, see

        https://www3.nd.edu/~rwilliam/xsoc73994/CountModels.pdf

        especially p. 17.

        Joao is not the only one to question the use of zero-inflated models. Paul Allison expresses qualms too. See

        https://www3.nd.edu/~rwilliam/xsoc73994/CountModels.pdf
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        Stata Version: 17.0 MP (2 processor)

        WWW: https://www3.nd.edu/~rwilliam


        • #5
          Thank you very much Clyde. I do appreciate your clear explanation.


          • #6

            Thanks for your reply and the literature recommendations. Whether a zero-inflated model is needed can sometimes be a little subjective. Negative binomial regression seems reasonable in my study, but the fit statistics showed that ZINB fitted better than negative binomial regression.


            • #7
              Hello.

              You could also try a comparison of the fit of alternative count models.

              For example:
              Code:
               findit countfit
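
              countfit is a user-written command, so it has to be installed first. If you would rather stay with official commands, a rough manual comparison can be sketched as below. This is only an illustration under stated assumptions: it uses the fish example dataset from the Stata manual entry for zip (not the data from #1), the variable names belong to that dataset, and webuse needs internet access.

              ```stata
              * Assumption: Stata's example data, not the poster's data
              webuse fish, clear

              poisson count persons livebait
              estimates store m_pois

              nbreg count persons livebait
              estimates store m_nb

              zip count persons livebait, inflate(child camper)
              estimates store m_zip

              zinb count persons livebait, inflate(child camper)
              estimates store m_zinb

              * Log likelihood, AIC and BIC side by side (lower AIC/BIC is better)
              estimates stats m_pois m_nb m_zip m_zinb
              ```

              Information criteria are only one way to compare these models, and they do not settle whether a zero-inflated specification makes substantive sense for your outcome.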
