Dear Statalists,
I am dealing with a binary endogenous variable and am therefore trying to use 2SLS with a Probit correction in a preliminary ("0th") stage, a procedure described in Wooldridge (2002, p. 623, Procedure 18.1).
Below I describe my question in detail using an example (please note that I have cross-posted this question on Cross Validated at: https://stats.stackexchange.com/ques...ditional-fixed).
I have a yearly panel dataset of firms, and half of the firms in the sample are treated. Treatment status does not change over time within a firm (i.e., the sample consists only of never-treated and always-treated firms), so firm fixed effects cannot be included in the regression models. The dependent variable is the profit a firm made in a given year. I am interested in estimating the impact of the treatment on firm profit. Industry (e.g., the firm's 4-digit SIC code) and year fixed effects are included, along with some other covariates.
Assume that I have a "good" instrumental variable (IV) that satisfies both the relevance condition and the exclusion restriction. But this IV is time-invariant within industry; that is, the value of the IV is the same for all firms that belong to the same industry (i.e., those that have the same SIC code), regardless of time. This means that, in a conventional 2SLS approach, industry fixed effects cannot be included in the first stage (because the IV would be absorbed by the industry fixed effects). My understanding is that the same covariates and fixed effects should be used in both the first and second stages of 2SLS; if that is correct, industry fixed effects should not be included in the second stage of 2SLS either (this seems to be the case when I look at Stata commands like ivreg or ivreghdfe).
Now, consider 2SLS with a 0th-stage Probit correction. Under this approach, the treatment is regressed on the IV (and on the other covariates and fixed effects) using a Probit model in the 0th stage. For the same reason described above (i.e., the IV is time-invariant within industry), industry fixed effects cannot be included in this 0th-stage Probit model. The predicted probabilities of being treated, estimated from this Probit model, are then used as a new IV in the subsequent 2SLS model, just as in a conventional 2SLS approach.
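To make the procedure concrete, here is a minimal sketch of what I have in mind. All variable names (profit, treat, z, x1, x2, sic4, year) are hypothetical placeholders rather than my actual variables, and the sketch is only meant to show the order of the steps, not a definitive specification.

```stata
* Hypothetical variable names:
*   profit = outcome, treat = binary treatment (0/1),
*   z = industry-level IV, x1 x2 = covariates,
*   sic4 = 4-digit SIC industry code, year = calendar year

* 0th stage: Probit of treatment on the IV, covariates, and year effects.
* Industry effects are left out here because z does not vary within sic4.
probit treat z x1 x2 i.year

* Predicted probability of treatment, used as the generated instrument
predict double phat, pr

* 2SLS with phat as the instrument for treat.
* Including i.sic4 here is exactly the step I am unsure about.
ivregress 2sls profit x1 x2 i.year i.sic4 (treat = phat)

* First-stage diagnostics for the generated instrument
estat firststage
```

The alternative I am comparing against would simply drop i.sic4 from the ivregress line, so that the 2SLS part uses the same set of fixed effects as the Probit.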
So, my question is whether it would be "wrong" to include industry fixed effects in the 2SLS part of this "2SLS with Probit correction" approach. Now that I have a "better" IV (i.e., the predicted probabilities of being treated based on the Probit model) that does vary over time within industry, technically I can include industry fixed effects in the first stage (and also in the second stage) of 2SLS. The new IV is also strongly correlated with the treatment (i.e., the Cragg-Donald Wald F statistic is sufficiently large in the first stage of 2SLS with industry fixed effects).
I can't think of any obvious reason why it would be wrong to include industry fixed effects in the 2SLS part of this approach (i.e., 2SLS with Probit correction), but I am not sure about the statistical/econometric implications of doing so. I have looked at some studies that used this approach (e.g., Adams et al. 2009, Cameron et al. 1988, Dubin and McFadden 1984), and they don't seem to emphasize that the same set of fixed effects should be included in all three models (i.e., the Probit and the two stages of 2SLS).
I would greatly appreciate your input.
References:
- Adams, R., Almeida, H. and Ferreira, D., 2009. Understanding the relationship between founder–CEOs and firm performance. Journal of Empirical Finance, 16(1), pp.136-150.
- Cameron, A.C., Trivedi, P.K., Milne, F. and Piggott, J., 1988. A microeconometric model of the demand for health care and health insurance in Australia. The Review of Economic Studies, 55(1), pp.85-106.
- Dubin, J.A. and McFadden, D.L., 1984. An econometric analysis of residential electric appliance holdings and consumption. Econometrica: Journal of the Econometric Society, pp.345-362.
- Wooldridge, J.M., 2002. Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, MA.