Heckman two-stage correction for self-selection in continuous endogenous variable

Henry Kohlsdorf

Join Date: Feb 2017

Posts: 3
#1

02 Feb 2017, 23:55

Dear All:

I have a regression with a potentially endogenous variable resulting from self-selection (Clougherty et al. 2016). I'd like to address this problem with a Heckman two-stage correction.

Now, what makes my case a bit different from a Heckman sample selection model is that the endogenous variable is continuous (Garen, 1984). The heckman command in Stata requires a binary sample selection indicator, so it is not applicable here.
Can I manually estimate an OLS model as the first stage and include a lambda variable in the second stage?

Would this result in a forbidden regression as it would for an instrumental variable regression that is manually calculated?

Is there a user-written or built-in Stata program that could implement a Heckman correction with a continuous endogenous variable?

Thanks so much!
Henry

References:
Clougherty JA, Duso T, Muck J. 2016. Correcting for self-selection based endogeneity in management research: Review, recommendations and simulations. Organizational Research Methods 19(2): 286–347.
Garen J. 1984. The returns to schooling: A selectivity bias approach with a continuous choice variable. Econometrica 52(5): 1199–1218.
Henry Kohlsdorf

Join Date: Feb 2017

Posts: 3
#2

08 Feb 2017, 06:34

Hi everyone. I'm just checking in to see if anyone has experience with this issue. Thanks for any feedback!
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2159
#3

08 Feb 2017, 08:41

I don't fully understand your setup. Do you mean you have a continuous "treatment" variable that you want to think of as endogenous? In Garen's example, the continuous (well, not really, but he treats it as such) treatment variable is schooling. The self selection into schooling causes it to be endogenous in a standard wage equation.

More generally, let y1 be the response and y2 the continuous treatment that is subject to self selection. If y2 appears with a constant coefficient, just use 2SLS. The method that includes a "lambda" variable is called a control function approach, but it just produces 2SLS in this case. The lambda is just the reduced form residuals for y2. Garen's method is for when the effect of y2 is heterogeneous, and that leads to an interaction term between y2 and the reduced form residuals.

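Garen's method in Stata, where z1, ..., zL are all of the exogenous variables and only z1, ..., zM appear in the y1 equation. You need M < L, that is, at least one excluded exogenous variable to act as the instrument: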

Code:

reg y2 z1 z2 ... zL
predict v2hat, resid
reg y1 z1 ... zM y2 v2hat c.y2#c.v2hat, robust

If you omit the interaction c.y2#c.v2hat you just get 2SLS. You can test jointly the significance of v2hat and the c.y2#c.v2hat to test the null of exogeneity of y2. You should bootstrap the two-step procedure to get proper standard errors, in general.
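
One possible way to bootstrap both steps together is sketched below. It is only an illustration under stated assumptions: the data are simulated, the variable list is fixed at L = 3 and M = 2 (z3 is the excluded instrument), and the program name garen_cf, the seeds, and the number of replications are all hypothetical choices rather than anything from the posts above. The key point is that the first-stage regression sits inside the program, so the residuals are re-estimated in every bootstrap sample.

```stata
* Sketch only: simulated data; all names and numbers are hypothetical.
clear all
set seed 12345
set obs 2000

* Exogenous variables; z3 is excluded from the y1 equation (the instrument)
generate double z1 = rnormal()
generate double z2 = rnormal()
generate double z3 = rnormal()

* v2 drives y2 and is correlated with both the error and the effect of y2
generate double v2 = rnormal()
generate double u1 = 0.5*v2 + rnormal()
generate double b2 = 1 + 0.3*v2
generate double y2 = 1 + 0.5*z1 - 0.5*z2 + z3 + v2
generate double y1 = 1 + z1 + z2 + b2*y2 + u1

* One pass of the two-step procedure, so that bootstrap repeats both steps
capture program drop garen_cf
program define garen_cf, eclass
    capture drop v2hat
    regress y2 z1 z2 z3
    predict double v2hat, residuals
    regress y1 z1 z2 y2 v2hat c.y2#c.v2hat
end

* Resample observations and rerun both regressions in each replication
bootstrap _b, reps(200) seed(2017): garen_cf

* Joint test of exogeneity of y2, based on the bootstrap covariance matrix
test v2hat c.y2#c.v2hat
```

With real data, only the program definition and the last two commands would be needed, with the variable lists replaced by your own. More replications than the 200 used here are usually advisable for reported results.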

I discuss these methods in more detail here: http://jhr.uwpress.org/content/50/2/420.full.pdf

To get better help, tell us more about your problem and show us Stata code and output.

JW