Dear all,
I am facing the following issue and would appreciate some (hopefully) enlightening comments.
I have a panel dataset with a binary endogenous variable (i.e. treatment evaluation) and the problem of non-random selection. I want to estimate the treatment effect with different models AND account for model uncertainty with respect to the covariates. The Stata commands I am considering are:
- etreg (= treatment effect regression with probit in the first, regression in the second stage)
- xtivreg (I guess no explanation is needed)
- and heckman (probit in the first stage, regression including the inverse Mills ratio in the second)
For the etreg command I have already managed to average the coefficients of 1024 regressions. It was basically just one big loop. For the model weights, I followed
"Buckland, S. T., Burnham, K. P., & Augustin, N. H. (1997). Model selection: an integral part of inference. Biometrics, 603-618.". In a nutshell, one obtains the BIC (Bayesian Information Criterion) after each regression (Stata command: estat ic), computes exp(-BIC/2) for each model (= numerator), sums these terms over all models (= denominator), and uses the resulting weights to calculate the averaged coefficient. Since the BIC requires a log-likelihood, maximum likelihood needs to be applied. This worked well for the etreg command.
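To make the weighting arithmetic concrete, here is a minimal sketch in Python (not Stata, and not my actual loop). It assumes the weights take the form exp(-BIC_k/2) / sum_j exp(-BIC_j/2), which is how I read Buckland et al., and that the coefficient of interest (here the treatment effect) appears in every model. The function names and all numbers are made up for illustration.

```python
import math

def bic_weights(bics):
    """Weights w_k = exp(-BIC_k/2) / sum_j exp(-BIC_j/2)."""
    best = min(bics)
    # Subtracting the smallest BIC leaves the weights unchanged
    # but keeps exp() from underflowing to zero for large BIC values.
    raw = [math.exp(-(b - best) / 2.0) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]

def averaged_coefficient(coefs, bics):
    """Weighted average of one coefficient over all converged models."""
    weights = bic_weights(bics)
    return sum(w * c for w, c in zip(weights, coefs))

# Hypothetical BICs and treatment coefficients from three specifications
bics = [1502.3, 1500.1, 1510.8]
coefs = [0.42, 0.40, 0.55]

weights = bic_weights(bics)
avg = averaged_coefficient(coefs, bics)
print(weights, avg)
```

Only differences in BIC matter here, so a model whose BIC is about 10 points worse than the best one gets a weight close to zero. Specifications that fail to converge would have to be dropped before the weights are normalised.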
However, for the heckman ML estimation, some model specifications do not converge, even though they did in the etreg case. I have already changed the regressors back and forth, but some combinations of covariates still won't converge. I can't really wrap my mind around why it works for etreg but not for heckman: don't both use a probit in the first stage and OLS in the second?
For the heckman, twostep estimation it works perfectly fine, but I am not able to obtain the BIC (no log-likelihood values). I guess for xtivreg I could simply switch to the "xtivreg2, fe liml" option and work my way around it.
Hence, I would appreciate some comments on the following:
- Since the Stata command "xtivreg, fe" also saves e(ll) and "estat ic" is therefore applicable, I could work around the heckman problem manually: probit in the first stage, calculation of the inverse Mills ratio, estimation of the second stage with xtivreg, fe. But how would I manually adjust the standard errors (necessary for Heckman)? I couldn't find anything online. Some comments simply said "use the Stata command, this will do the job".
- Is anyone aware of model weights that are not tied to maximum likelihood estimation? Couldn't I simply use a coefficient of determination (such as R-squared) and calculate the model weights in the same spirit as explained above [-> Model weight 1 = exp(-BIC model 1 / 2) / Sum over all models of exp(-BIC / 2)]?
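For the manual two-step route in the first point above, the inverse Mills ratio step can be sketched as follows. This is a minimal Python illustration (not Stata), assuming the standard formula lambda = phi(index) / Phi(index) for observations in the selected sample, where index is the fitted linear prediction from the first-stage probit. The function names and index values are hypothetical. The sketch does not address the standard-error correction, which is exactly the part I am unsure about.

```python
import math

def std_normal_pdf(v):
    """Standard normal density phi(v)."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(v):
    """Standard normal distribution function Phi(v), via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def inverse_mills(index):
    """lambda = phi(index) / Phi(index) for a selected observation.

    Note: for extremely negative indices Phi(index) underflows to zero,
    so a production version would need a more careful tail computation.
    """
    return std_normal_pdf(index) / std_normal_cdf(index)

# Hypothetical fitted probit indices for three observations
indices = [-1.0, 0.0, 1.5]
imr = [inverse_mills(v) for v in indices]
print(imr)
```

The ratio falls as the index rises, so observations with a high predicted selection probability receive a small correction term in the second stage.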