Dear Statalist community,
I am working on a research project that involves estimating a Poisson Pseudo-Maximum Likelihood (PPML) model in Stata. My dependent variable has a high proportion of zeros because the data are dyadic, recording firm migration from origin to destination.
Is there a rule of thumb or guideline for assessing whether PPML is suitable when the dependent variable has a high proportion of zeros? What factors should I consider when evaluating the impact of zero inflation on the model's performance? For instance, is 90% zeros considered too high? Would 99% zeros cause severe problems?
I am already aware that PPML is consistent and well-behaved with many zeros, but how big is "many"?
Thank you for your help and insights.
Kind Regards,
Mauricio.