  • 2SLS with endogenous variable: Z=fitted values from 1st stage + instrument?

    Dear Statalist,

    With a binary endogenous variable, we can get more precise estimates by fitting a nonlinear first stage and using its fitted values as an instrument in the second stage (e.g., Wooldridge, 2010).

    What I still do not quite understand is whether we can also include the original instrument as an instrument alongside the fitted value.

    Let's say we have:

    y1 = endogenous variable
    z = instrument for y1
    y2 = dependent variable of interest
    x = covariates

    First stage (logit): Pr(y1 = 1 | z, x) = Lambda(beta_0 + beta_1 * z + beta_2 * x), where Lambda is the logistic CDF
    --> obtain fitted probability of y1 = hat{y1}

    Second stage: y2 = alpha_0 + alpha_1 * y1 + alpha_2 * x + u2, with hat{y1} used as the instrument for y1

    In Stata, we would write:

    Code:
    logit y1 z x, robust
    predict y1_hat
    ivreg2 y2 x (y1 = y1_hat), robust first endog(y1)

    Could we also add z as an additional instrument for y1 in the second stage?

    Code:
    ivreg2 y2 x (y1 = z y1_hat), robust first endog(y1)

    Thanks!

  • #2
    Read this and the cited sections of the book:
    https://www.mostlyharmlesseconometrics.com/2009/07/is-2sls-really-ok/
    You'd exclude z from the ivreg2 command, but there are issues with this approach.

    Angrist and Pischke (A-P) recommend just using ivreg2 without modification.

    Also see
    https://www.stata.com/meeting/mexico13/abstracts/materials/mex13_baum.pdf
    which also offers ivreg2 without modification.
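
The two approaches discussed so far can be compared side by side on simulated data. The following is only a minimal sketch under stated assumptions: the data-generating process, seed, sample size, and coefficient values are all invented for illustration, and the built-in ivregress is used in place of the user-written ivreg2 so that the example runs without installing anything from SSC.

```stata
* Illustrative simulation; every name and parameter value here is an assumption
clear
set seed 12345
set obs 5000
generate x  = rnormal()                            // exogenous covariate
generate z  = rnormal()                            // excluded instrument
generate e  = rnormal()                            // unobservable that makes y1 endogenous
generate y1 = (z + 0.5*x + e + rnormal() > 0)      // binary endogenous regressor
generate y2 = 1 + y1 + 0.5*x + e + rnormal()       // true coefficient on y1 is 1

* (1) Plain 2SLS with z as the only excluded instrument
ivregress 2sls y2 x (y1 = z), vce(robust)
scalar b_plain = _b[y1]

* (2) Logit first step, then the fitted probability as the only excluded instrument
logit y1 z x, vce(robust)
predict y1_hat, pr
ivregress 2sls y2 x (y1 = y1_hat), vce(robust)
scalar b_fit = _b[y1]

display "plain 2SLS: " b_plain "   fitted-probability IV: " b_fit
```

Comparing the two coefficient estimates and their standard errors on a design like this gives a feel for how much, if anything, the nonlinear first step changes in a given setting.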



    • #3
      Thanks, George! Yes, I will do both: (1) run the regression using the fitted value from the nonlinear first-stage model as the instrument, and (2) run "garden-variety" 2SLS.



      • #4
        Dear Kerstin Schmidt,

        Please note that each instrument may identify a different local parameter. You can test if that is the case by running 2SLS with both instruments, as you suggested in #1, and performing the J-test.

        Best wishes,

        Joao
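
The suggestion above (2SLS with both instruments, followed by an overidentification test) might look like the sketch below. It is a sketch under assumptions: the simulated data, seed, and coefficient values are made up, and the built-in ivregress with estat overid stands in for ivreg2, which reports the Hansen J statistic in its own output when the robust option is used.

```stata
* Illustrative simulation; every name and parameter value here is an assumption
clear
set seed 2024
set obs 5000
generate x  = rnormal()
generate z  = rnormal()
generate e  = rnormal()
generate y1 = (z + 0.5*x + e + rnormal() > 0)      // binary endogenous regressor
generate y2 = 1 + y1 + 0.5*x + e + rnormal()

* Nonlinear first step: fitted probability of y1
logit y1 z x, vce(robust)
predict y1_hat, pr

* Two excluded instruments for one endogenous regressor,
* so there is one overidentifying restriction to test
ivregress 2sls y2 x (y1 = z y1_hat), vce(robust)
estat overid
estat firststage
```

With one endogenous regressor and two excluded instruments the test has one degree of freedom. A rejection would be consistent with the two instruments identifying different parameters, although it could also indicate that at least one instrument is invalid.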



        • #5
          Hello,
          If I have more than one potentially endogenous variable, say 3 or 4 (v1, v2, v3, v4), should I rely on my theoretical understanding or on what the endog() option empirically suggests when deciding which variables to treat as endogenous? Moreover, is individual significance required, testing endog(v1) and so on separately, or is the joint significance of endog(v1 v2 v3 v4) enough?

          Thank you for your suggestions
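
For reference, the two ways of calling endog() mentioned in this question can be written as follows. This is only an illustrative sketch: the data are simulated, the variable names z1-z3 and all coefficient values are assumptions, only two regressors are used to keep it short, and it requires the user-written ivreg2 (with its dependency ranktest) from SSC. It shows the syntax and does not settle which test should guide the choice.

```stata
* Requires: ssc install ivreg2  and  ssc install ranktest
* Illustrative simulation; every name and parameter value here is an assumption
clear
set seed 777
set obs 5000
generate x = rnormal()
forvalues j = 1/3 {
    generate z`j' = rnormal()                      // excluded instruments z1-z3
}
generate e  = rnormal()
generate v1 = z1 + 0.5*z3 + 0.5*e + rnormal()      // endogenous by construction
generate v2 = z2 + 0.5*z3 + rnormal()              // exogenous by construction
generate y  = 1 + v1 + v2 + 0.5*x + e + rnormal()

* Joint test: are v1 and v2 together exogenous?
ivreg2 y x (v1 v2 = z1 z2 z3), robust endog(v1 v2)

* Separate tests: one regressor at a time, the other still treated as endogenous
ivreg2 y x (v1 v2 = z1 z2 z3), robust endog(v1)
ivreg2 y x (v1 v2 = z1 z2 z3), robust endog(v2)
```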



          • #6
            Dear @Joao Santos Silva,

            Thank you!
            Would it really make sense to include z as an additional instrument, as in:

            Code:
            logit y1 z x, robust
            predict y1_hat
            ivreg2 y2 x (y1 = z y1_hat), robust first endog(y1)
            ?

            The more I read about it, the less sure I am... Haven't found this anywhere.



            • #7
              Dear Kerstin Schmidt,

              Yes, that is essentially what we did in (see the top of page 291):

              Windmeijer, F.A.G. and Santos Silva, J.M.C. (1997), Estimation of Count Data Models with Endogenous regressors; An Application to Demand for Health Care, Journal of Applied Econometrics, 12(3), pp. 281-294.

              Best wishes,

              Joao

