two part probit ppml model

Andyx Zhang

Join Date: Dec 2023

Posts: 5
#1

10 Dec 2023, 03:54

Hello!

I am modeling the factors that predict aid distribution patterns. I have a cross-sectional time-series (panel) dataset. All but two of the explanatory variables are time-varying. The dependent variable has a large share of zeros, so standard OLS is not appropriate. That leaves several options: Tobit, Heckman, two-part models, and Poisson Pseudo Maximum Likelihood (PPML). I lean towards PPML in this context.

I also follow the theoretical argument that aid allocation is a two-part process: selection (where) and then allocation (how much). The Heckman model is largely unsuitable because there is no independence between the two equations, and I estimate both equations with the same set of covariates, so identification rests solely on the nonlinearity of the inverse Mills ratio (IMR).

Thus, can someone explain, or recommend some papers on, why a two-part Probit-PPML structure (retaining all zero values in the second, allocation step) would be preferred to a traditional Logit-OLS structure (modeling only the positive outcomes in the second, allocation step)? It would also be really helpful if you could suggest other statistical diagnostics I could run to test model robustness (e.g. fixed versus random effects, heteroscedasticity, null tests, etc.). I understand how to do this with traditional OLS, but am unsure in the context of a Poisson model.

Thank you so much for your help!

Kind regards,
Andy

My code:
* covariates shared by both parts
local xvars ln_ungaVoting taiwan ln_oresMetalsReal ln_mineralProduction lag_democracy lag_corruptionControl lag_polStability lag_ln_debtGDP lag_ln_gdpCapita lag_ln_population

* ppml: all observations (zeros included), year fixed effects
foreach dv in odaLike oofLike total {
    ppmlhdfe `dv' `xvars', absorb(year) vce(robust)

    * exponentiated coefficients (incidence rate ratios)
    di "Incidence Rate Ratios for model with dependent variable `dv':"
    foreach var of local xvars {
        di "`var': " exp(_b[`var'])
    }
}

* probit: selection step (xtprobit requires the data to be xtset first)
foreach dv in odaLike oofLike total {
    * keep missing outcomes missing; Stata treats missing as larger than any number
    gen byte `dv'_binary = `dv' > 0 if !missing(`dv')
}

foreach dv in odaLike_binary oofLike_binary total_binary {
    xtprobit `dv' `xvars' i.year
}
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2187
#2

10 Dec 2023, 13:24

If you use a two-part model then you condition on positive outcomes in the second part. So you could use Poisson or exponential pseudo MLE for the second part. But if you want to use fixed effects, you shouldn’t really do that for the probit. You can for Poisson, as you know. You could try correlated random effects for both parts.

BTW, the most robust is to use all data and Poisson FE. If the mean is exponential, this is the best choice.
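
For readers following along, here is a minimal sketch of how the two suggestions above might look in Stata. It is not code from the thread. It assumes a panel identifier called country_id (hypothetical; the thread never names one), uses odaLike as the example outcome, and implements correlated random effects with the Mundlak device, that is, by adding unit-level means of the covariates.

```stata
* Assumption: country_id is a hypothetical panel identifier, not named in the thread
xtset country_id year

local xvars ln_ungaVoting taiwan ln_oresMetalsReal ln_mineralProduction lag_democracy lag_corruptionControl lag_polStability lag_ln_debtGDP lag_ln_gdpCapita lag_ln_population

* Poisson fixed effects on all observations, zeros included
* (time-invariant covariates are dropped automatically)
xtpoisson odaLike `xvars' i.year, fe vce(robust)

* Correlated random effects: unit means of each covariate (Mundlak device)
* The mean of a time-invariant covariate is collinear with it and will be omitted
local means
foreach v of local xvars {
    bysort country_id: egen double mean_`v' = mean(`v')
    local means `means' mean_`v'
}

* Part 1: probit for any aid versus none, with the unit means added
gen byte odaLike_pos = odaLike > 0 if !missing(odaLike)
probit odaLike_pos `xvars' `means' i.year, vce(cluster country_id)

* Part 2: Poisson QMLE on positive outcomes only, with the same unit means
poisson odaLike `xvars' `means' i.year if odaLike > 0 & !missing(odaLike), vce(cluster country_id)
```

In an unbalanced panel the unit means would ideally be computed over the estimation sample only; the sketch ignores that for brevity.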
Andyx Zhang

Join Date: Dec 2023

Posts: 5
#3

11 Dec 2023, 08:28

Originally posted by Jeff Wooldridge

If you use a two-part model then you condition on positive outcomes in the second part. So you could use Poisson or exponential pseudo MLE for the second part. But if you want to use fixed effects, you shouldn’t really do that for the probit. You can for Poisson, as you know. You could try correlated random effects for both parts.

BTW, the most robust is to use all data and Poisson FE. If the mean is exponential, this is the best choice.

Hi Professor Wooldridge,

Thank you for the reply! Could you clarify why I can't use fixed effects for probit models? Would a Logit regression work then?

Also, I followed your advice and used all the data with PPML, year fixed effects, and robust standard errors. Does convergence say anything about the robustness of the model, and how can I evaluate the model's performance other than with the pseudo R-squared?

Thank you so much!
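
One check often run alongside PPML is a RESET-type test of the exponential mean specification, together with the squared correlation between observed and fitted values as a rough fit measure. The sketch below is only an illustration of that idea, not something proposed in the thread. It uses odaLike as the example outcome and replaces ppmlhdfe with poisson plus i.year, which gives the same point estimates as absorbing year; the names mu_hat, xb_hat and xb_hat2 are made up for the example.

```stata
local xvars ln_ungaVoting taiwan ln_oresMetalsReal ln_mineralProduction lag_democracy lag_corruptionControl lag_polStability lag_ln_debtGDP lag_ln_gdpCapita lag_ln_population

* Baseline Poisson QMLE with year dummies and robust standard errors
poisson odaLike `xvars' i.year, vce(robust)

* Fitted mean and linear index from the baseline model
predict double mu_hat, n
predict double xb_hat, xb

* Rough fit measure: squared correlation between observed and fitted values
quietly correlate odaLike mu_hat
di "Squared correlation, observed vs fitted: " r(rho)^2

* RESET-type check: add the squared linear index and test its coefficient
gen double xb_hat2 = xb_hat^2
poisson odaLike `xvars' xb_hat2 i.year, vce(robust)
test xb_hat2
```

A significant coefficient on xb_hat2 would suggest that the exponential mean is misspecified; a non-rejection does not by itself establish that the specification is right.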