  • Heckman two-stage correction for self-selection in continuous endogenous variable

    Dear All:

    I have a regression with a potentially endogenous variable resulting from self-selection (Clougherty et al. 2016). I'd like to address this problem with a Heckman two-stage correction.

    What makes my case different from a Heckman sample selection model is that the endogenous variable is continuous (Garen, 1984). Stata's heckman command requires a binary sample selection indicator, so it does not apply here.
    1. Can I manually calculate an OLS model as first stage and include a Lambda variable in the second stage?
    2. Would this result in a forbidden regression as it would for an instrumental variable regression that is manually calculated?
    3. Is there a user-written or built-in Stata program that could implement a Heckman correction with continuous endogenous variable?
    Thanks so much!
    Henry

    References:
    Clougherty JA, Duso T, Muck J. 2016. Correcting for self-selection based endogeneity in management research: Review, recommendations and simulations. Organizational Research Methods 19(2): 286–347.
    Garen J. 1984. The returns to schooling: A selectivity bias approach with a continuous choice variable. Econometrica 52(5): 1199–1218.

  • #2
    Hi everyone. I'm just checking in to see if anyone has experience with this issue. Thanks for any feedback!

    • #3
      I don't fully understand your setup. Do you mean you have a continuous "treatment" variable that you want to think of as endogenous? In Garen's example, the continuous (well, not really, but he treats it as such) treatment variable is schooling. Self-selection into schooling causes it to be endogenous in a standard wage equation.

      More generally, let y1 be the response and y2 the continuous treatment that is subject to self-selection. If y2 appears with a constant coefficient, just use 2SLS. The method that includes a "lambda" variable is called a control function approach, but in this case it produces exactly the 2SLS estimates. The lambda is simply the reduced form residuals for y2. Garen's method is for when the effect of y2 is heterogeneous, which leads to an interaction term between y2 and the reduced form residuals.
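
      The constant-coefficient case can be checked numerically. The following is a minimal sketch on simulated data; the variable names (z1, z2, v2, u1), the seed, and all parameter values are assumptions made up for illustration, not taken from the original question. It compares the 2SLS coefficient on y2 with the one from the control function regression.

      ```stata
      * Simulated data for illustration only; names and values are hypothetical
      clear
      set seed 12345
      set obs 1000
      generate z1 = rnormal()              // included exogenous variable
      generate z2 = rnormal()              // excluded instrument
      generate v2 = rnormal()              // reduced form error for y2
      generate u1 = 0.5*v2 + rnormal()     // correlation with v2 makes y2 endogenous
      generate y2 = 1 + 0.5*z1 + z2 + v2
      generate y1 = 1 + z1 + 2*y2 + u1     // constant coefficient on y2

      * 2SLS with z2 as the instrument for y2
      ivregress 2sls y1 z1 (y2 = z2)
      scalar b_2sls = _b[y2]

      * Control function: add the first-stage residuals to the structural equation
      regress y2 z1 z2
      predict double v2hat, residuals
      regress y1 z1 y2 v2hat

      display "2SLS: " b_2sls "   control function: " _b[y2]
      ```

      The two coefficients on y2 should agree up to numerical precision. The standard errors from the second regress do not account for the first-stage estimation, so they are not the ones to report.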

      Garen's method in Stata, with M < L (you need at least one excluded exogenous variable to serve as an instrument):

      Code:
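      * First stage: reduced form for y2 on all L exogenous variables
      reg y2 z1 z2 ... zL
      predict v2hat, resid
      * Second stage: add the residuals and their interaction with y2
      reg y1 z1 ... zM y2 v2hat c.y2#c.v2hat, robust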
      If you omit the interaction c.y2#c.v2hat, you just get 2SLS. To test the null hypothesis that y2 is exogenous, test the joint significance of v2hat and c.y2#c.v2hat. In general, you should bootstrap the two-step procedure to get proper standard errors.
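
      One way to bootstrap both steps together is to wrap them in a small program and pass it to the bootstrap prefix. This is only a sketch under stated assumptions: the program name garen_cf, the simulated data, the seeds, and the number of replications are all hypothetical choices, and with real data you would skip the simulation block and substitute your own variables.

      ```stata
      * Simulated data for illustration only; names and values are hypothetical
      clear
      set seed 2024
      set obs 1000
      generate z1 = rnormal()                  // included exogenous variable
      generate z2 = rnormal()                  // excluded instrument
      generate v2 = rnormal()
      generate u1 = 0.5*v2 + rnormal()
      generate b1 = 0.3*v2 + 0.2*rnormal()     // heterogeneous part of the effect of y2
      generate y2 = 1 + 0.5*z1 + z2 + v2
      generate y1 = 1 + z1 + (2 + b1)*y2 + u1

      * Hypothetical wrapper: runs both steps so each bootstrap draw redoes the first stage
      capture program drop garen_cf
      program define garen_cf, eclass
          capture drop v2hat
          regress y2 z1 z2
          predict double v2hat, residuals
          regress y1 z1 y2 v2hat c.y2#c.v2hat
      end

      * Resample observations and re-estimate both stages in every replication
      bootstrap _b, reps(200) seed(12345): garen_cf

      * Joint Wald test of exogeneity of y2, using the bootstrap covariance matrix
      test v2hat c.y2#c.v2hat
      ```

      Because the residuals are regenerated inside the program, the reported standard errors and the joint test reflect the sampling variation in the first stage as well as the second.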

      I discuss these methods in more detail here: http://jhr.uwpress.org/content/50/2/420.full.pdf

      To get better help, tell us more about your problem and show us Stata code and output.

      JW
