
  • #16
    Originally posted by Jeff Wooldridge
    I read a bit too quickly. Because you regress the interactions on the nonlinear functions it is not the same as the "forbidden regression." But the better way is to use X1hat*X3 as the IV for X1*X3 and X2hat*X3 as the IV for X2*X3. And you shouldn't have exclusion restrictions in your first stages.

    Code:
    reg X1 Z1 Z2
    predict X1hat
    reg X2 Z1 Z2
    predict X2hat
    ivregress 2sls Y X3 (X1 X2 c.X1#c.X3 c.X2#c.X3 = Z1 Z2 c.X1hat#c.X3 c.X2hat#c.X3), vce(r)
    Thank you, Prof. Wooldridge. Here Z1 is the instrument for X1 and Z2 for X2. Should we not keep just the relevant instrument instead of both the instruments when predicting X1hat and X2hat?

    Also, could you please suggest a reference that we could cite while using this approach for interaction terms with endogenous variables? Thanks.



    • #17
      In linear regression, if a RHS variable is truly "irrelevant", its coefficient is zero. Whether you include or exclude it from the regression does not affect the coefficients of the other regressors. While the degrees of freedom may differ with inclusion or exclusion, this is negligible when dealing with a few regressors and a sufficiently large sample size. I recall a discussion of this method in Jeff's MIT Press book, so you can find the relevant chapter and cite it.



      • #18
        I have a binary outcome (y1) with two endogenous variables (x1, x2) and I want to interact x1 with x3 (exogenous) and x2 with x4 (exogenous). I also have two instruments (z1, z2) and controls (c).

        The reduced-form regression results seem to provide meaningful coefficients. However, the coefficient on x2 in the second stage (after instrumenting) is very large in magnitude (greater than 1 in absolute value; -1.49 to be precise). I run the following regression: ivregress 2sls y1 x3 x4 c (x1 x2 x1#x3 x2#x4 = z1 z2 z1#x3 z2#x4), r.

        I have checked all the IV diagnostics and they seem fine. Any idea why this might be happening? Thank you.



        • #19
          Tekalign: Without knowing details, my guess is you didn't center all variables about their means before creating the interactions. But it could also relate to the units of measurement of x2.
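
          A minimal sketch of what centering before forming the interactions could look like, using the variable names from #18. The _c suffix and the generated interaction names (x1x3, x2x4, z1x3, z2x4) are hypothetical choices for illustration, and c stands in for the list of controls. With centered interactions, the coefficient on x1 is the effect at the mean of x3, and the coefficient on x2 is the effect at the mean of x4.

          ```stata
          * Sketch only: names ending in _c and the generated interaction
          * variables are hypothetical; c is a placeholder for the controls.
          foreach v of varlist x1 x2 x3 x4 z1 z2 {
              quietly summarize `v'
              generate double `v'_c = `v' - r(mean)
          }

          * Build the interactions from the centered variables
          generate double x1x3 = x1_c*x3_c
          generate double x2x4 = x2_c*x4_c
          generate double z1x3 = z1_c*x3_c
          generate double z2x4 = z2_c*x4_c

          * Main effects of x1 and x2 stay in their original units
          ivregress 2sls y1 x3_c x4_c c (x1 x2 x1x3 x2x4 = z1 z2 z1x3 z2x4), vce(robust)
          ```

          Centering does not change the interaction coefficients; it only changes the point at which the main effects are evaluated, so a main-effect coefficient that remains large after centering likely has another cause.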



          • #20
            Thank you, Prof. Wooldridge. Yes, I have not centered the variables. Regarding the unit of measurement, both X1 and X2 are dummies, while X3 and X4 are continuous variables. All the controls (c) are at their baseline (for the survey of four rounds).

            I just also tried after centering all the vars around their means and the coefficient of the main effect of X2 (dummy) is still high, 1.47.
            Last edited by Tekalign Gutu; 18 May 2024, 14:53.



            • #21
              I also tried after centering all the vars around their means and the coefficient of the main effect of X2 (dummy) is still high, 1.47.



              • #22
                Dear Professor,
                I am currently struggling with a similar problem, but in the context of a panel data model (FE regression). I am writing down the code for your reference.

                xtivreg y x1 x2 controls (x1 c.x1#x2= z c.z1#x2), fe first

                y (continuous) is the dependent variable, x1 (continuous) is the endogenous independent variable and x2 (dummy) is exogenous. z (continuous) is the instrument for x1. In my model, I need to interact x1 with x2. If I execute the above code, I get an error 'depvars may not be interactions r(198);'.

                I have the following queries:
                a) Are interactions (#) not allowed with xtivreg?
                b) Is it correct to replace x1 in the above code with predicted x1 (x1hat), obtained from a previous regression of x1 on z and the other controls?
                c) Which variables do we need to center about their means: only the endogenous x1, or all independent variables? I am unsure because x1 is a continuous variable.

                Thanks,
                Palla



                • #23
                  Unfortunately, xtivreg doesn't allow you to specify interactions as endogenous explanatory variables. This should be fixed in the future, as it's natural and it's already allowed in ivregress 2sls.

                  But it's an easy fix. Generate the interactions and then include them.

                  Code:
                  sum x1
                  gen x1_dm = x1 - r(mean)
                  gen x1_dmx2 = x1_dm*x2
                  sum z
                  gen z_dm = z - r(mean)
                  gen z_dmx2 = z_dm*x2
                  xtivreg y x1 x2 controls (x1 x1_dmx2 = z z_dmx2), fe first
                  It's not necessary to demean the IV z, but it will make the coefficients in the first stages more meaningful. Hopefully you can confirm that z_dmx2 is more important than z in the first stage for x1_dmx2.



                  • #24
                    Dear Professor,
                    Many thanks for clarifying my doubts. I followed these steps, and here is a brief update on the results. I obtained two first-stage regressions (for x1 and x1_dmx2). In the first, x1 is significantly correlated with z and not with z_dmx2. In the second, x1_dmx2 is strongly correlated with z_dmx2. I hope this is what is required for instrument relevance (a significant correlation of each endogenous variable with its own instrument). I had to drop x1 from the list of exogenous regressors in the second stage (outcome model) and keep it inside the () only, because listing it in both places produced missing standard errors for all variables. This is what is permitted in the syntax, I guess.
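
                    For reference, a sketch of the command as described here, assuming the variables x1_dmx2 and z_dmx2 were generated as in #23 and that controls is a placeholder for the actual control variables. The endogenous x1 appears only inside the parentheses:

                    ```stata
                    * Sketch only: x1 is listed once, as an endogenous regressor.
                    * controls is a placeholder for the actual control variables.
                    xtivreg y x2 controls (x1 x1_dmx2 = z z_dmx2), fe first
                    ```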
