Why is there a big difference between ZIP and ZINB using the same variables?

wendy chang zhou

Join Date: Jun 2019

Posts: 5
#1


06 Mar 2020, 15:12

Hi everyone. I am running zero-inflated Poisson models because the outcome has many zeros (about 1/4 of the observations). I have run ZIP and ZINB before and found that the two models, with the same predictors, produced similar results. In the present study, however, the ZIP and ZINB models produced quite different results in the count equations (the results in the logistic regression part are similar). It's weird. The results are as follows:

ZIP model
------------------------------------------------------------------------------------------
                 outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
outcome (count equation) |
                     1sc |    .103509   .0356089     2.91   0.004     .0337169    .1733011
                     Rpr |  -.0314016   .0062603    -5.02   0.000    -.0436716   -.0191316
                     Rex |   .0233748   .0058477     4.00   0.000     .0119134    .0348361
                     Rsu |  -.0073679   .0051857    -1.42   0.155    -.0175317    .0027959
                     Ppr |   .0176424   .0052335     3.37   0.001      .007385    .0278999
                     Pex |   .0106725   .0048924     2.18   0.029     .0010836    .0202615
                     Psu |   .0108973   .0035661     3.06   0.002      .003908    .0178866
                   _cons |   1.210286   .1851025     6.54   0.000     .8474913     1.57308
-------------------------+----------------------------------------------------------------

ZINB model
------------------------------------------------------------------------------------------
                 outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
outcome (count equation) |
                     1sc |    .116671    .067785     1.72   0.085    -.0161852    .2495271
                     Rpr |  -.0364287   .0113316    -3.21   0.001    -.0586382   -.0142193
                     Rex |   .0219059   .0112583     1.95   0.052    -.0001599    .0439717
                     Rsu |  -.0055789   .0094421    -0.59   0.555    -.0240852    .0129273
                     Ppr |   .0168384   .0097909     1.72   0.085    -.0023514    .0360282
                     Pex |   .0142822   .0094063     1.52   0.129    -.0041538    .0327183
                     Psu |   .0104673   .0067804     1.54   0.123     -.002822    .0237566
                   _cons |   1.131097   .3383747     3.34   0.001     .4678948    1.794299
-------------------------+----------------------------------------------------------------

I would appreciate it if anyone could help.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

06 Mar 2020, 15:36

You have been duped by focusing on p-values and statistical significance! If you look at the coefficients themselves, you will see that in every instance they are very similar. The negative binomial model assumes an error distribution with (potentially much) greater variance than the Poisson, so it offers less precise estimates of your coefficients. (Essentially, because the negative binomial error distribution tolerates a wider range of residuals than the Poisson, a wider range of coefficient estimates is compatible with the negative binomial model.) Even though the coefficient estimates are, for practical purposes, the same in both models, the standard errors, reflecting the variance in the error distribution, are larger with the negative binomial, so your p-values have increased as well. But in fact, both models are telling you pretty much the same thing: the coefficients are nearly the same, and even the confidence intervals haven't changed by very much, even though some that just barely excluded zero before now just barely include it.
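To make the comparison concrete, here is a small sketch that recomputes the Wald z statistics, p-values, and 95% intervals from the coefficients and standard errors posted in #1. It is written in Python rather than Stata only so that it is self-contained; the helper name `wald_summary` is made up for this illustration. On the posted numbers, the ZINB standard errors come out roughly 1.8 to 1.9 times the ZIP ones while the coefficients barely move.

```python
from statistics import NormalDist

# (coefficient, standard error) pairs copied from the count equations in post #1.
ZIP = {
    "1sc": (0.103509, 0.0356089),
    "Rpr": (-0.0314016, 0.0062603),
    "Rex": (0.0233748, 0.0058477),
    "Rsu": (-0.0073679, 0.0051857),
    "Ppr": (0.0176424, 0.0052335),
    "Pex": (0.0106725, 0.0048924),
    "Psu": (0.0108973, 0.0035661),
    "_cons": (1.210286, 0.1851025),
}
ZINB = {
    "1sc": (0.116671, 0.067785),
    "Rpr": (-0.0364287, 0.0113316),
    "Rex": (0.0219059, 0.0112583),
    "Rsu": (-0.0055789, 0.0094421),
    "Ppr": (0.0168384, 0.0097909),
    "Pex": (0.0142822, 0.0094063),
    "Psu": (0.0104673, 0.0067804),
    "_cons": (1.131097, 0.3383747),
}

STD_NORMAL = NormalDist()


def wald_summary(coef, se, level=0.95):
    """Return (z, two-sided p-value, CI lower, CI upper) for a Wald test."""
    z = coef / se
    p = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    half_width = STD_NORMAL.inv_cdf(0.5 + level / 2) * se
    return z, p, coef - half_width, coef + half_width


# Side-by-side view: similar coefficients, larger ZINB standard errors.
for name in ZIP:
    zip_coef, zip_se = ZIP[name]
    nb_coef, nb_se = ZINB[name]
    zip_z = wald_summary(zip_coef, zip_se)[0]
    nb_z = wald_summary(nb_coef, nb_se)[0]
    print(f"{name:>5}  ZIP {zip_coef:+.4f} (z={zip_z:5.2f})  "
          f"ZINB {nb_coef:+.4f} (z={nb_z:5.2f})  SE ratio {nb_se / zip_se:.2f}")
```

Reading across each printed row shows where the change in "significance" comes from: the z statistic shrinks because its denominator grew, not because the estimated effect changed.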

The confusion you are facing is one of the many reasons it is time for people to abandon statistical significance, as recommended by the American Statistical Association. See https://www.tandfonline.com/doi/full...5.2019.1583913 for the "executive summary" and https://www.tandfonline.com/toc/utas20/73/sup1 for all 43 supporting articles. Or https://www.nature.com/articles/d41586-019-00857-9 for the tl;dr.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#3

07 Mar 2020, 04:19

Dear wendy chang zhou,

I agree with Clyde's comment that the results are not that different. However, I wonder whether you really need a zero-inflated model; a high percentage of zeros does not mean that you have zero inflation. For example, a sample from a Poisson distribution with a mean of 0.01 will have about 99% zeros, and there is no zero inflation at all.
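The arithmetic behind this example is short: for a Poisson variable with mean mu, P(Y = 0) = exp(-mu). The sketch below (Python, with function names invented for illustration) evaluates that share and also inverts it, showing which constant Poisson mean would by itself produce the one-quarter share of zeros mentioned in #1. It ignores covariates, so it is only a rough check and not a test for zero inflation.

```python
import math


def poisson_zero_share(mean):
    """Probability of a zero under a Poisson distribution with the given mean."""
    return math.exp(-mean)


def mean_for_zero_share(share):
    """Poisson mean at which the probability of a zero equals `share`."""
    return -math.log(share)


print(f"P(Y=0) at mean 0.01: {poisson_zero_share(0.01):.4f}")
print(f"Mean giving 25% zeros: {mean_for_zero_share(0.25):.3f}")
```

A plain Poisson with a mean near 1.39 already yields about a quarter zeros, so the share of zeros alone does not settle whether a zero-inflated model is needed.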

Best wishes,

Joao
Richard Williams

Join Date: Apr 2014

Posts: 4987
#4

07 Mar 2020, 05:25

Welcome to Statalist.

Your output is very hard to read. You should use code tags instead. See pt 12 of the Statalist FAQ on asking questions effectively.

I agree with Clyde. The coefficients are not that different. If you use Poisson when you should be using nbreg, the standard errors and significance tests will tend to be too optimistic, just as they are in your results. For more, see

https://www3.nd.edu/~rwilliam/xsoc73994/CountModels.pdf

especially p. 17.
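As a rough illustration of why Poisson standard errors look too optimistic when the data are overdispersed, the sketch below assumes the NB2 (mean-dispersion) variance function Var(Y) = mu + alpha*mu^2 and an intercept-only model, where the ratio of the appropriate standard error to the Poisson one works out to sqrt(1 + alpha*mu). The values of `mu` and `alpha` are hypothetical, not estimates from this thread, and the function names are invented for the example.

```python
import math


def nb2_variance(mu, alpha):
    """Variance under the NB2 parameterization: mu + alpha * mu**2."""
    return mu + alpha * mu ** 2


def se_inflation(mu, alpha):
    """Ratio of NB2-based to Poisson-based SE in an intercept-only model.

    The Poisson model assumes Var(Y) = mu, so the ratio is
    sqrt(nb2_variance / mu) = sqrt(1 + alpha * mu).
    """
    return math.sqrt(nb2_variance(mu, alpha) / mu)


# Hypothetical values chosen only for illustration.
mu, alpha = 3.0, 0.9
print(f"Poisson variance: {mu:.2f}")
print(f"NB2 variance:     {nb2_variance(mu, alpha):.2f}")
print(f"SE inflation:     {se_inflation(mu, alpha):.2f}")
```

With alpha = 0 the two models coincide and the ratio is 1; the larger the overdispersion, the more the Poisson standard errors understate the uncertainty.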

Joao is not the only one to question the use of zero-inflated models. Paul Allison expresses qualms too. See

https://www3.nd.edu/~rwilliam/xsoc73994/CountModels.pdf

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
wendy chang zhou

Join Date: Jun 2019

Posts: 5
#5

14 Mar 2020, 19:35

Thank you very much Clyde. I do appreciate your clear explanation.
wendy chang zhou

Join Date: Jun 2019

Posts: 5
#6

14 Mar 2020, 19:45

Thanks for your reply and the recommended literature. Whether a zero-inflated model is needed can be somewhat subjective. Negative binomial regression seems reasonable in my study, but the fit statistics showed that ZINB fit better than negative binomial regression.
Gaston Fernandez

Join Date: Jul 2015

Posts: 27
#7

03 Apr 2020, 11:09

Hello.

You could also try a comparison of the fit of alternative count models.

For example:

Code:

findit countfit
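Information criteria are one common way to compare the fit of alternative count models. The sketch below (Python, for self-containment) shows the standard AIC and BIC formulas; the log-likelihoods, parameter counts, and sample size are hypothetical placeholders, not results from this thread, and should be replaced with the values your own models report.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2*lnL (smaller is better)."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k*ln(N) - 2*lnL (smaller is better)."""
    return n_params * math.log(n_obs) - 2 * log_lik


# Hypothetical (log-likelihood, number of parameters) for each model.
models = {
    "Poisson": (-1450.0, 8),
    "NB": (-1210.0, 9),
    "ZIP": (-1260.0, 16),
    "ZINB": (-1195.0, 17),
}
n_obs = 500  # hypothetical sample size

for name, (log_lik, k) in models.items():
    print(f"{name:>8}  AIC {aic(log_lik, k):8.1f}  BIC {bic(log_lik, k, n_obs):8.1f}")
```

BIC penalizes the extra inflation-equation parameters more heavily than AIC, so the two criteria can disagree about whether the zero-inflated version is worth the added complexity.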