Coefficient changes sign and blows up after 2-stage residual inclusion/control function

Soeren Dallmeyer

Join Date: Oct 2015

Posts: 9
#1

22 May 2018, 05:01

Hi,

I have a cardinal dependent variable Y, an endogenous binary independent variable X, and numerous control variables C.
If I run a standard OLS model, I get a significant, positive coefficient of 0.02 for X.
Now, I am implementing a 2-stage residual inclusion model where my first stage is a probit model X=(C, Z) with Z being my instrument.
Since I want to use the residuals from this stage I estimated the model as follows:

glm X C Z, fam(bin) link(probit)
predict Xhat, response

and then used the residuals in my original OLS model:

reg Y X C Z Xhat

Now I am getting a significant, negative coefficient of -4.25 for X and a coefficient of +4.49 for the residual Xhat.

Can someone explain this drastic change in the coefficient?

Thanks a lot for your help.
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

23 May 2018, 12:16

You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

If you have an endogenous binary variable but are treating the outcome as continuous, 2SLS is generally consistent. If you want to, you could use ivregress, ivreg2, cmp, gsem, or eregress to model this more directly. I'm not sure about control function approaches, but in many such estimators you need something in the equation for the endogenous variable that doesn't appear in the outcome equation. Your second stage includes Z, so I suspect the second equation is identified only by the non-linearity in the glm, which is not a great way to do identification.

I looked at this with this code:

Code:

clear
set obs 100
g c=rnormal()
g z=rnormal()
g x=(rnormal() + c + z)>0
g y=x + c + z + rnormal()
glm x c z, fam(bin) link(probit)
predict Xhat, response
reg y x c z Xhat
reg y x c z
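For comparison, here is a minimal sketch of a variant of that simulation in which the instrument is left out of the outcome equation, so identification does not rest on the probit non-linearity alone. This is an illustration under my own assumptions, not the original poster's data: the shared error `u`, the residual name `xres`, the seed, and the sample size of 1000 are all made up for the example. The raw response residual is only one possible control function, and the second-stage standard errors do not account for the residual being estimated, so treat the output as a rough check rather than a definitive estimate.

```stata
* Sketch only: simulated data under assumed names (u, xres), not the poster's dataset
clear
set seed 12345
set obs 1000
generate c = rnormal()
generate z = rnormal()
* u enters both equations, which is what makes x endogenous here
generate u = rnormal()
generate x = (u + c + z) > 0
* z is excluded from the outcome equation; the simulated effect of x is 1
generate y = x + c + u + rnormal()

* First stage: probit via glm; "response" returns observed minus fitted x
glm x c z, family(binomial) link(probit)
predict double xres, response

* Second stage (2SRI): include the residual but leave z out
regress y x c xres

* Linear 2SLS on the same data for comparison
ivregress 2sls y c (x = z)
```

Dropping z from the second stage is the point of the exercise: with z on the right-hand side as well, x, z, and the residual are separated only by the curvature of the probit fit.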