
  • Interaction effects in ivreg2

    Hi, I am trying to estimate whether the coefficient has changed between two periods using a date dummy (pre- and post-1975). The independent variable is endogenous so I need to use IV. I do not think my implementation is working correctly.

    Without interaction effects, if I manually predict the endogenous variables in the first stage and then run the second stage on the predictions, I get the same coefficients as I would from using ivreg2 directly. (Obviously the standard errors are not correct, but here I am only checking the coefficients.)

    But when I incorporate interactions, my results are different. I think it is because the interactions are also being used in the first stage, but I only want to investigate the second stage and leave the first-stage prediction unchanged.

    Here is a reconstructed example:

    Code:
    use     "http://www.stata-press.com/data/r13/wpi1.dta" , clear
    
    *Create endogenous variables 
    gen X1 = wpi*2 + runiform()*200
    gen X2 = wpi*0.5 + runiform()*500
    
    *Create exogenous variables
    gen Z1 = X1 + runiform()*200
    gen Z2 = X2 + runiform()*300
    
    *Create date dummy
    gen post1975 = t>tq(1975q1)
    
    *Predict first-stage 
    reg X1 Z1 Z2
    predict X1_pred
    reg X2 Z1 Z2
    predict X2_pred
    
    * Overall regression
    ivreg2 wpi (X1 X2 = Z1 Z2), first
    reg wpi X1_pred X2_pred
    ** Coefficients match, as expected
    
    * Interaction regression
    ivreg2 wpi (X1 X2 c.X1#i.post = Z1 Z2 c.Z1#i.post c.Z2#i.post) post1975
    reg wpi X1_pred X2_pred c.X1_pred#i.post1975 post1975
    ** Coefficients do not match. Why not?

  • #2
    You have a total of three first stage regressions. The endogenous interaction is a variable in its own right.
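
    To make the point about three first stages concrete, here is a minimal sketch (not code from the original posts) of a manual two-step that should reproduce the ivreg2 coefficients when an interaction is endogenous. The variable names X1_post, Z1_post, Z2_post, the *_hat predictions, and the seed are assumptions introduced for illustration, and ivreg2 is assumed to be installed (ssc install ivreg2). The interaction gets its own first stage, and every first stage uses the full instrument set plus the included exogenous variable post1975. The standard errors of the manual second stage are still wrong; only the coefficients are being compared.

```stata
* Sketch under stated assumptions; requires ivreg2 (ssc install ivreg2)
use "http://www.stata-press.com/data/r13/wpi1.dta", clear
set seed 12345                      // assumed seed, only for reproducibility

* Simulated variables, as in #1
gen X1 = wpi*2 + runiform()*200
gen X2 = wpi*0.5 + runiform()*500
gen Z1 = X1 + runiform()*200
gen Z2 = X2 + runiform()*300
gen post1975 = t > tq(1975q1)

* Hypothetical names: explicit interaction variables
gen X1_post = X1*post1975
gen Z1_post = Z1*post1975
gen Z2_post = Z2*post1975

* Three first stages: one per endogenous regressor, each on ALL instruments
* and on the included exogenous regressor post1975
foreach v in X1 X2 X1_post {
    regress `v' Z1 Z2 Z1_post Z2_post post1975
    predict double `v'_hat, xb
}

* Manual second stage (coefficients only; standard errors are not valid)
regress wpi X1_hat X2_hat X1_post_hat post1975
scalar b_manual = _b[X1_post_hat]

* Same model in one step, with valid standard errors
ivreg2 wpi (X1 X2 X1_post = Z1 Z2 Z1_post Z2_post) post1975
```

    The regression in #1 differs from this because c.X1_pred#i.post1975 multiplies the prediction of X1 by the dummy, which is not the same as the prediction of X1*post1975 from its own first stage.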


    • #3
      Thanks for clarifying - that means I think I am estimating the wrong model.

      My target model is to
      a) in a first-stage predict X1 and X2 using Z1 and Z2
      b) regress y on the predicted X1 and X2
      c) test for a difference in second-stage coefficient on X1, holding the coefficient on X2 constant.

      In a non-IV setting, I would just use an interaction, but what we discussed above doesn't seem to apply in a straightforward way here. Is there a better way to code up this model?

      Code:
       
       reg wpi X1_pred X2_pred c.X1_pred#i.post1975 post1975
      These are the correct coefficients, but doing it manually means that my inference is wrong. Is there a way to get the desired coefficients using ivreg2 so that I can obtain the correct standard errors?


      • #4
        This is a linear model. So the coefficients on X1 and X2 in the IV regression are marginal effects, i.e. they express how much the outcome is expected to change for a unit change in the predictor, holding the other predictor constant. Or what kind of test do you have in mind?


        • #5
          #3 was not clear, but rereading #1, I see what you want to do. Here is how you can specify the IV with interactions to test the difference in the estimated coefficient over the two sample periods. Using ## instead of # will give you the differences directly as coefficients without the need to use the test command, but this way is more intuitive.

          Code:
          use "http://www.stata-press.com/data/r13/wpi1.dta" , clear
          
          *Create endogenous variables 
          gen X1 = wpi*2 + runiform()*200
          gen X2 = wpi*0.5 + runiform()*500
          
          *Create exogenous variables
          gen Z1 = X1 + runiform()*200
          gen Z2 = X2 + runiform()*300
          
          *Create date dummy
          gen post1975 = t>tq(1975q1)
          
          ivreg2 wpi (X1 X2 = Z1 Z2) if !post1975, robust
          ivreg2 wpi (X1 X2 = Z1 Z2) if post1975, robust
          gen cons=1
          ivreg2 wpi (i.post1975#(c.X1 c.X2) = i.post1975#(c.Z1 c.Z2)) i.post1975#c.cons, nocons robust 
          test 0.post1975#c.X1= 1.post1975#c.X1
          Results:

          Code:
          . 
          . ivreg2 wpi (X1 X2 = Z1 Z2) if !post1975, robust
          
          IV (2SLS) estimation
          --------------------
          
          Estimates efficient for homoskedasticity only
          Statistics robust to heteroskedasticity
          
                                                                Number of obs =       61
                                                                F(  2,    58) =     1.07
                                                                Prob > F      =   0.3495
          Total (centered) SS     =  2457.769137                Centered R2   =   0.0442
          Total (uncentered) SS   =  78222.36125                Uncentered R2 =   0.9700
          Residual SS             =  2349.173792                Root MSE      =    6.206
          
          ------------------------------------------------------------------------------
                       |               Robust
                   wpi | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                    X1 |   .0218851   .0179232     1.22   0.222    -.0132437    .0570139
                    X2 |   .0073444   .0072797     1.01   0.313    -.0069237    .0216124
                 _cons |   29.40602   3.990646     7.37   0.000      21.5845    37.22755
          ------------------------------------------------------------------------------
          Underidentification test (Kleibergen-Paap rk LM statistic):             20.840
                                                             Chi-sq(1) P-val =    0.0000
          ------------------------------------------------------------------------------
          Weak identification test (Cragg-Donald Wald F statistic):               21.927
                                   (Kleibergen-Paap rk Wald F statistic):         26.731
          Stock-Yogo weak ID test critical values: 10% maximal IV size              7.03
                                                   15% maximal IV size              4.58
                                                   20% maximal IV size              3.95
                                                   25% maximal IV size              3.63
          Source: Stock-Yogo (2005).  Reproduced by permission.
          NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
          ------------------------------------------------------------------------------
          Hansen J statistic (overidentification test of all instruments):         0.000
                                                           (equation exactly identified)
          ------------------------------------------------------------------------------
          Instrumented:         X1 X2
          Excluded instruments: Z1 Z2
          ------------------------------------------------------------------------------
          
          . 
          . ivreg2 wpi (X1 X2 = Z1 Z2) if post1975, robust
          
          IV (2SLS) estimation
          --------------------
          
          Estimates efficient for homoskedasticity only
          Statistics robust to heteroskedasticity
          
                                                                Number of obs =       63
                                                                F(  2,    60) =    20.47
                                                                Prob > F      =   0.0000
          Total (centered) SS     =  19040.39622                Centered R2   =   0.3260
          Total (uncentered) SS   =  522916.7378                Uncentered R2 =   0.9755
          Residual SS             =   12832.5333                Root MSE      =    14.27
          
          ------------------------------------------------------------------------------
                       |               Robust
                   wpi | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                    X1 |   .1427594   .0230702     6.19   0.000     .0975427    .1879761
                    X2 |   .0206721   .0137092     1.51   0.132    -.0061975    .0475417
                 _cons |   43.66184   7.916588     5.52   0.000     28.14562    59.17807
          ------------------------------------------------------------------------------
          Underidentification test (Kleibergen-Paap rk LM statistic):             22.578
                                                             Chi-sq(1) P-val =    0.0000
          ------------------------------------------------------------------------------
          Weak identification test (Cragg-Donald Wald F statistic):               47.073
                                   (Kleibergen-Paap rk Wald F statistic):         88.684
          Stock-Yogo weak ID test critical values: 10% maximal IV size              7.03
                                                   15% maximal IV size              4.58
                                                   20% maximal IV size              3.95
                                                   25% maximal IV size              3.63
          Source: Stock-Yogo (2005).  Reproduced by permission.
          NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
          ------------------------------------------------------------------------------
          Hansen J statistic (overidentification test of all instruments):         0.000
                                                           (equation exactly identified)
          ------------------------------------------------------------------------------
          Instrumented:         X1 X2
          Excluded instruments: Z1 Z2
          ------------------------------------------------------------------------------
          
          . 
          . gen cons=1
          
          . 
          . ivreg2 wpi (i.post1975#(c.X1 c.X2) = i.post1975#(c.Z1 c.Z2)) i.post1975#c.cons, nocons robust 
          
          IV (2SLS) estimation
          --------------------
          
          Estimates efficient for homoskedasticity only
          Statistics robust to heteroskedasticity
          
                                                                Number of obs =      124
                                                                F(  6,   118) =   766.94
                                                                Prob > F      =   0.0000
          Total (centered) SS     =  112504.7755                Centered R2   =   0.8651
          Total (uncentered) SS   =   601139.099                Uncentered R2 =   0.9747
          Residual SS             =  15181.70709                Root MSE      =    11.06
          
          ---------------------------------------------------------------------------------
                          |               Robust
                      wpi | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
          ----------------+----------------------------------------------------------------
            post1975#c.X1 |
                       0  |   .0218851   .0179232     1.22   0.222    -.0132437    .0570139
                       1  |   .1427594   .0230702     6.19   0.000     .0975427    .1879761
                          |
            post1975#c.X2 |
                       0  |   .0073444   .0072797     1.01   0.313    -.0069237    .0216124
                       1  |   .0206721   .0137092     1.51   0.132    -.0061975    .0475417
                          |
          post1975#c.cons |
                       0  |   29.40602   3.990646     7.37   0.000      21.5845    37.22755
                       1  |   43.66184   7.916588     5.52   0.000     28.14562    59.17807
          ---------------------------------------------------------------------------------
          Underidentification test (Kleibergen-Paap rk LM statistic):              0.000
                                                             Chi-sq(1) P-val =    1.0000
          ------------------------------------------------------------------------------
          Weak identification test (Cragg-Donald Wald F statistic):               22.306
                                   (Kleibergen-Paap rk Wald F statistic):          0.000
          Stock-Yogo weak ID test critical values:                       <not available>
          ------------------------------------------------------------------------------
          Hansen J statistic (overidentification test of all instruments):         0.000
                                                           (equation exactly identified)
          ------------------------------------------------------------------------------
          Instrumented:         0b.post1975#c.X1 1.post1975#c.X1 0b.post1975#c.X2
                                1.post1975#c.X2
          Included instruments: 0b.post1975#c.cons 1.post1975#c.cons
          Excluded instruments: 0b.post1975#c.Z1 1.post1975#c.Z1 0b.post1975#c.Z2
                                1.post1975#c.Z2
          ------------------------------------------------------------------------------
          
          . 
          . test 0.post1975#c.X1= 1.post1975#c.X1
          
           ( 1)  0b.post1975#c.X1 - 1.post1975#c.X1 = 0
          
                     chi2(  1) =   17.12
                   Prob > chi2 =    0.0000
          
          .
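
          As a hedged illustration of the remark that the differences can be obtained directly as coefficients, the sketch below reparameterizes the pooled model with hand-made interaction variables rather than the ## operator. The names X1_post, X2_post, Z1_post, Z2_post and the seed are assumptions for illustration, and ivreg2 is assumed to be installed. Under this parameterization the coefficient on X1_post is the post-1975 minus pre-1975 difference in the X1 slope, so testing it against zero should give the same chi2 as the test command used with the # specification.

```stata
* Sketch under stated assumptions; requires ivreg2 (ssc install ivreg2)
use "http://www.stata-press.com/data/r13/wpi1.dta", clear
set seed 12345                      // assumed seed, only for reproducibility

* Simulated variables, as in #1
gen X1 = wpi*2 + runiform()*200
gen X2 = wpi*0.5 + runiform()*500
gen Z1 = X1 + runiform()*200
gen Z2 = X2 + runiform()*300
gen post1975 = t > tq(1975q1)

* Hypothetical names: interactions of every regressor and instrument with the dummy
foreach v in X1 X2 Z1 Z2 {
    gen `v'_post = `v'*post1975
}

* Difference parameterization: X1 and X2 carry the pre-1975 slopes,
* X1_post and X2_post carry the change after 1975
ivreg2 wpi (X1 X2 X1_post X2_post = Z1 Z2 Z1_post Z2_post) post1975, robust
test X1_post
scalar chi2_diff = r(chi2)

* Period-specific parameterization from this post, for comparison
gen cons = 1
ivreg2 wpi (i.post1975#(c.X1 c.X2) = i.post1975#(c.Z1 c.Z2)) i.post1975#c.cons, nocons robust
test 0.post1975#c.X1 = 1.post1975#c.X1
```

          Both specifications span the same regressors and instruments, so they are the same model written two ways; the choice is only about which coefficients are reported directly.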
