I want to model the effect of an endogenous (left-censored) explanatory variable on a continuous outcome variable using an unbalanced panel dataset. For this, I use a control function approach.
Because of the left-censoring, I estimate the reduced form (first-stage) equation with a correlated random effects tobit model (xttobit in Stata, with the time averages of the time-varying variables as additional explanatory variables, i.e., the Mundlak device). From this model, I compute the generalised residuals using the formula from Wooldridge (2014).
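To make the first step concrete, here is a minimal sketch of the textbook generalised residual for a tobit that is left-censored at a known limit. It is written in Python rather than Stata purely for illustration. The function name and the arguments `xb` (fitted linear index) and `sigma` (error standard deviation) are hypothetical, and both inputs are taken as given: how the xttobit variance components should enter `sigma`, and whether this expression matches the Wooldridge (2014) formula term by term, are assumptions that are not settled here.

```python
import math


def norm_pdf(a):
    """Standard normal density."""
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)


def norm_cdf(a):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-a / math.sqrt(2.0))


def tobit_generalized_residual(y2, xb, sigma, lower=0.0):
    """Generalised residual for a tobit left-censored at `lower`.

    y2    : observed (possibly censored) value of the endogenous variable
    xb    : fitted linear index from the first stage (assumed given)
    sigma : standard deviation of the first-stage error (assumed given)
    """
    if y2 > lower:
        # Uncensored observation: the ordinary residual.
        return y2 - xb
    # Censored observation: E[v | v <= lower - xb]
    #   = -sigma * phi(a) / Phi(-a), with a = (xb - lower) / sigma.
    a = (xb - lower) / sigma
    return -sigma * norm_pdf(a) / norm_cdf(-a)


# One uncensored and one censored observation.
print(tobit_generalized_residual(2.0, 1.5, 1.0))
print(tobit_generalized_residual(0.0, 0.0, 1.0))
```

The censored branch is always negative, because it is the expected value of the error given that it fell below the censoring threshold.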
I then estimate the outcome equation using pooled OLS, again with time averages of the time-varying variables (Mundlak). To this model I add the generalised residuals from the first step.
Wooldridge (2015) shows that pooled OLS with the time averages of the time-varying regressors gives the same coefficient estimates on those regressors as the fixed effects estimator. When I compare the two estimators for the outcome equation, however, the results differ. I found that this is because I do not add the time average of the generalised residuals to my pooled OLS model.
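The discrepancy can be reproduced on simulated data. The sketch below (Python, standard library only, chosen for illustration rather than Stata) builds an unbalanced panel and compares the within (FE) estimator with pooled OLS including unit means. Everything in it is hypothetical: the data-generating process, the coefficient values, and the variable `g`, which is only a simulated stand-in for the generalised residual and not an actual first-stage output. Unit means are computed over each unit's observed periods.

```python
import random


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(X, y):
    """OLS coefficients via the normal equations."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


def unit_means(ids, v):
    """Mean of v within each unit, over that unit's observed rows."""
    tot, cnt = {}, {}
    for i, val in zip(ids, v):
        tot[i] = tot.get(i, 0.0) + val
        cnt[i] = cnt.get(i, 0) + 1
    return [tot[i] / cnt[i] for i in ids]


random.seed(0)
ids, x, g, y = [], [], [], []
for i in range(200):
    c = random.gauss(0, 1)                   # unobserved unit effect
    for _ in range(random.randint(2, 6)):    # unbalanced: 2 to 6 periods
        xi = c + random.gauss(0, 1)          # regressor correlated with c
        gi = 0.5 * c + random.gauss(0, 1)    # stand-in for the generalised residual
        ids.append(i)
        x.append(xi)
        g.append(gi)
        y.append(1.0 + 2.0 * xi + 0.7 * gi + c + random.gauss(0, 1))

xbar, gbar, ybar = unit_means(ids, x), unit_means(ids, g), unit_means(ids, y)
rows = list(zip(x, g, xbar, gbar))

# Within (FE) estimator: demean y, x and g by unit.
fe = ols([[a - am, b - bm] for a, b, am, bm in rows],
         [yi - ym for yi, ym in zip(y, ybar)])

# Pooled OLS with the unit means of BOTH x and g.
cre_full = ols([[1.0, a, b, am, bm] for a, b, am, bm in rows], y)

# Pooled OLS with the unit mean of x only (mean of g omitted).
cre_part = ols([[1.0, a, b, am] for a, b, am, _ in rows], y)

print("FE            (x, g):", fe[0], fe[1])
print("CRE, all means (x, g):", cre_full[1], cre_full[2])
print("CRE, no gbar   (x, g):", cre_part[1], cre_part[2])
```

The first two lines of output should agree up to numerical precision, while the third generally will not. In this setup the equivalence with FE requires the unit mean of every time-varying regressor in the pooled model, including the added residual term.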
Consequently, my questions are:
- Do I have to add the time average of the generalised residuals to the pooled OLS model?
- Can I use FE estimation for the outcome equation instead? In the literature, the control function approach seems to be combined mostly with pooled OLS or CRE.
Sources:
- Wooldridge, J. M. (2014). Quasi-maximum likelihood estimation and testing for nonlinear models with endogenous explanatory variables. Journal of Econometrics, 182(1), 226-234. https://doi.org/10.1...om.2014.04.020
- Wooldridge, J. M. (2015). Control Function Methods in Applied Econometrics. The Journal of Human Resources, 50(2), 420-445. http://www.jstor.org/stable/24735991