
  • IV for interaction with a dummy: Warning: variance matrix is nonsymmetric or highly singular

    Dear Statalist,

    I am running ivregress on a model with one endogenous variable X and its interaction with an exogenous dummy D (0 or 1):
    Y = b0 + b1 X + b2 X*D + b3 D, where I instrument X and X*D with Z and Z*D.

    The estimated first-stage coefficients look sensible, but I am worried about the first-stage standard errors:
    1) without the vce(robust) option, some p-values equal 1 in both first-stage regressions;
    2) with vce(robust), I get the output below;
    3) with vce(cluster var), both first-stage regressions report no standard errors and show "Warning: variance matrix is nonsymmetric or highly singular".


    Code:
    . ivregress 2sls Y (X c.X#D = Z c.Z#D) D, vce(robust) first
    
    First-stage regressions
    -----------------------
    
                                                    Number of obs     =      1,120
                                                    F(   3,   1116)   =     104.82
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.7058
                                                    Adj R-squared     =     0.7050
                                                    Root MSE          =     0.6680
    
    ------------------------------------------------------------------------------
                 |               Robust
               X |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               D |  -3.10e-15    .122954    -0.00   1.000     -.241247     .241247
               Z |   .8933804   .0712486    12.54   0.000      .753584    1.033177
                 |
           D#c.Z |
              1  |   2.10e-15   .1007608     0.00   1.000    -.1977019    .1977019
                 |
           _cons |   .1350366   .0869416     1.55   0.121    -.0355508     .305624
    ------------------------------------------------------------------------------
    Warning:  variance matrix is nonsymmetric or highly singular
    
                                                    Number of obs     =      1,120
                                                    F(   0,   1116)   =          .
                                                    Prob > F          =          .
                                                    R-squared         =     0.8272
                                                    Adj R-squared     =     0.8268
                                                    Root MSE          =     0.4723
    
    ------------------------------------------------------------------------------
                 |               Robust
         1.D#c.X |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               D |   .1350366          .        .       .            .           .
               Z |  -2.02e-15          .        .       .            .           .
                 |
           D#c.Z |
              1  |   .8933804          .        .       .            .           .
                 |
           _cons |   3.00e-15          .        .       .            .           .
    ------------------------------------------------------------------------------
    
    
    Instrumental variables (2SLS) regression          Number of obs   =      1,120
                                                      Wald chi2(3)    =      26.17
                                                      Prob > chi2     =     0.0000
                                                      R-squared       =     0.0212
                                                      Root MSE        =     .74611
    
    ------------------------------------------------------------------------------
                 |               Robust
               Y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               X |  -.0004001   .0336982    -0.01   0.991    -.0664473    .0656471
                 |
           D#c.X |
              1  |   .0269442    .049598     0.54   0.587     -.070266    .1241544
                 |
               D |   .1875806   .0797811     2.35   0.019     .0312125    .3439487
           _cons |  -.0700652   .0508648    -1.38   0.168    -.1697582    .0296279
    ------------------------------------------------------------------------------
    Instrumented:  X 1.D#c.X
    Instruments:   D Z 1.D#c.Z

    I could not come up with an explanation for this behaviour of the standard errors, and I am not sure whether I can use these 2SLS estimates in the end.

    I would be very thankful for any insight.
    Last edited by Nadiia Lazhevska; 27 Jul 2017, 04:11. Reason: ivregress

  • #2
    I would start by running the first stage with regress (OLS) on the same sample, for both X and D*X (i.e., after running the 2SLS, generate a variable use=e(sample) and do the rest with if use==1). Then you have all the regress diagnostics available, including estat vif. You can also look at the means and the correlation matrix of your right-hand-side variables. With a relatively high R-squared and such parameter values, it looks like you have some odd collinearity problem: D might be either 0 or 1 almost all the time, or X might have almost no variability.
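
    The checks suggested above could be sketched roughly as follows. This is only an illustration: Y, X, Z and D are the variable names from the original post, while use and XD are names chosen here for the sample marker and the hand-built interaction.

    ```stata
    * Fit the IV model first so that e(sample) marks the estimation sample
    ivregress 2sls Y (X c.X#D = Z c.Z#D) D, vce(robust) first
    generate byte use = e(sample)

    * Hand-built interaction (XD is an illustrative name)
    generate double XD = D*X if use == 1

    * First stage for X, run by OLS so that regress diagnostics are available
    regress X i.D c.Z i.D#c.Z if use == 1
    estat vif

    * First stage for the interaction X*D
    regress XD i.D c.Z i.D#c.Z if use == 1
    estat vif

    * Means, correlations and the split of the dummy on the same sample
    summarize X Z XD if use == 1
    correlate X D Z XD if use == 1
    tabulate D if use == 1
    ```

    Restricting every command to use == 1 keeps the diagnostics on exactly the observations that ivregress used.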



    • #3
      Dear Phil, thank you for your suggestions!
      It seems that collinearity is not the problem, as estat vif shows, and there is enough variation in both D and X.
      All the results below use a new sample, as I didn't save the original sample from the post.

      Code:
      . ivregress 2sls Y (X c.X#D = Z c.Z#D) D,  vce(robust) first
      
      First-stage regressions
      -----------------------
      
                                                      Number of obs     =      1,550
                                                      F(   3,   1546)   =     198.49
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.7474
                                                      Adj R-squared     =     0.7470
                                                      Root MSE          =     0.4775
      
      ------------------------------------------------------------------------------
                   |               Robust
                 X |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                 D |   .0074563   .0761402     0.10   0.922    -.1418926    .1568052
                 Z |   1.182626   .0730808    16.18   0.000     1.039278    1.325974
                   |
             D#c.Z |
                1  |  -.0048653   .0974682    -0.05   0.960    -.1960492    .1863185
                   |
             _cons |   .0588788    .056983     1.03   0.302    -.0528934    .1706509
      ------------------------------------------------------------------------------
      Warning:  variance matrix is nonsymmetric or highly singular
      
                                                      Number of obs     =      1,550
                                                      F(   0,   1546)   =          .
                                                      Prob > F          =          .
                                                      R-squared         =     0.8504
                                                      Adj R-squared     =     0.8501
                                                      Root MSE          =     0.3538
      
      ------------------------------------------------------------------------------
                   |               Robust
           1.D#c.X |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                 D |   .0663351          .        .       .            .           .
                 Z |  -5.06e-15          .        .       .            .           .
                   |
             D#c.Z |
                1  |   1.177761          .        .       .            .           .
                   |
             _cons |   6.99e-15          .        .       .            .           .
      ------------------------------------------------------------------------------
      
      
      Instrumental variables (2SLS) regression          Number of obs   =      1,550
                                                        Wald chi2(3)    =      22.91
                                                        Prob > chi2     =     0.0000
                                                        R-squared       =     0.0161
                                                        Root MSE        =     .70601
      
      ------------------------------------------------------------------------------
                   |               Robust
                 Y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                 X |   .0091762   .0350755     0.26   0.794    -.0595705    .0779228
                   |
             D#c.X |
                1  |  -.0194535   .0435971    -0.45   0.655    -.1049023    .0659953
                   |
                 D |  -.1573322   .0615256    -2.56   0.011    -.2779202   -.0367442
             _cons |   .1615124   .0513261     3.15   0.002     .0609151    .2621097
      ------------------------------------------------------------------------------
      Instrumented:  X 1.D#c.X
      Instruments:   D Z 1.D#c.Z
      
      
      . gen use=e(sample)
      
      . tab use
      
              use |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                1 |      1,550      100.00      100.00
      ------------+-----------------------------------
            Total |      1,550      100.00
      
      
      . gen XD = D*X
      
      
      . regress XD i.D Z i.D#c.Z, vce(robust)
      
      Linear regression                               Number of obs     =      1,550
                                                      F(2, 1546)        =          .
                                                      Prob > F          =          .
                                                      R-squared         =     0.8504
                                                      Root MSE          =     .35378
      
      ------------------------------------------------------------------------------
                   |               Robust
                XD |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                 D |
                1  |   .0663351   .0504838     1.31   0.189    -.0326888     .165359
                 Z |  -4.28e-15   7.85e-10    -0.00   1.000    -1.54e-09    1.54e-09
                   |
             D#c.Z |
                1  |   1.177761   .0644713    18.27   0.000       1.0513    1.304221
                   |
             _cons |   5.77e-15   9.48e-10     0.00   1.000    -1.86e-09    1.86e-09
      ------------------------------------------------------------------------------
      
      
      . estat vif
      
          Variable |       VIF       1/VIF  
      -------------+----------------------
               1.D |      2.83    0.352743
                 Z |      2.21    0.452612
             D#c.Z |
                1  |      4.05    0.247174
      -------------+----------------------
          Mean VIF |      3.03
      
      . estat vif, uncentered
      
          Variable |       VIF       1/VIF  
      -------------+----------------------
               1.D |      6.30    0.158848
                 Z |      6.27    0.159549
             D#c.Z |
                1  |      6.29    0.159053
         intercept |      6.28    0.159343
      -------------+----------------------
          Mean VIF |      6.28
      
      
      . tab D
      
                D |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |        698       45.03       45.03
                1 |        852       54.97      100.00
      ------------+-----------------------------------
            Total |      1,550      100.00
      
      . sum X
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
                 X |      1,550    1.174888    .9492785  -.1076302   8.594627

      I suspect that the problem here is conceptual rather than data-related.

      When regress is run with an interaction with a dummy, 1.D#c.X, as the dependent variable, as in the first stage of IV, the coefficients on the variables that are not interacted with the dummy (Z and the intercept) should be exactly zero analytically.
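
      The claim can be written out as a short derivation. This is a sketch under the assumption that Z varies within the D = 0 group; the symbols \pi_0, ..., \pi_3, v_i and \hat M are notation introduced here, not taken from the Stata output.

      ```latex
      % Requires amsmath. Regressors are ordered as (1, D, Z, Z*D).
      \begin{align}
      X_i D_i &= \pi_0 + \pi_1 D_i + \pi_2 Z_i + \pi_3 Z_i D_i + v_i \\
      % The model is saturated in D, so OLS fits each group separately.
      D_i = 0:&\quad 0 = \pi_0 + \pi_2 Z_i + v_i
        \;\Rightarrow\; \hat\pi_0 = \hat\pi_2 = 0,\quad \hat v_i = 0 \\
      D_i = 1:&\quad X_i = (\pi_0 + \pi_1) + (\pi_2 + \pi_3) Z_i + v_i \\
      % Middle matrix of the robust (sandwich) variance estimator.
      \hat M &= \sum_i \hat v_i^2\, x_i x_i'
              = \sum_{i:\,D_i = 1} \hat v_i^2\, x_i x_i',
      \qquad x_i = (1,\ 1,\ Z_i,\ Z_i)' \ \text{when } D_i = 1
      \end{align}
      ```

      Every x_i with D_i = 1 lies in the span of (1, 1, 0, 0)' and (0, 0, 1, 1)', so \hat M has rank at most 2 and the 4x4 robust variance matrix built from it is singular, with zero variance for \hat\pi_0 and \hat\pi_2. That would be consistent with the F(2, 1546) header and the standard errors of order 1e-10 on Z and _cons in the regress output.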

      Could regress be omitting the standard errors because it cannot pin down these zeros precisely?
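
      One way to probe this question is a small simulation on made-up data. The sketch below is an assumption-laden illustration, not a reproduction of the data in this thread: the seed, sample size and data-generating values are arbitrary, and ehat is an illustrative name for the residuals.

      ```stata
      * Simulated data: all parameter values here are arbitrary assumptions
      clear
      set seed 12345
      set obs 1000
      generate byte   D  = runiform() < 0.5
      generate double Z  = rnormal()
      generate double X  = 0.1 + 1.2*Z + rnormal(0, 0.5)
      generate double XD = D*X

      * Fully interacted first stage for the interaction term
      regress XD i.D c.Z i.D#c.Z, vce(robust)

      * Residuals should vanish (up to rounding) wherever D == 0,
      * because XD is identically zero there and the fit is perfect
      predict double ehat, residuals
      summarize ehat if D == 0
      summarize ehat if D == 1

      * Inspect the estimated variance matrix directly
      matrix list e(V)
      ```

      If the rows and columns of e(V) for Z and _cons are numerically zero, the missing standard errors would reflect a variance matrix that is singular by construction rather than a problem with the data.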

      Here is an extract from the data to give a feel for it. The problematic first stage is then basically a regression of X*D on D, Z and Z*D:

      Code:
      Y            X           Z           D    X*D         Z*D
      -.1065543    1.385382    1.09914     0    0           0
      -.4683331    1.385382    1.09914     1    1.385382    1.09914
      .2289454     1.301095    1.174958    0    0           0
      -.709859     1.301095    1.174958    1    1.301095    1.174958
      -.2006604    .8291737    .5841724    0    0           0
      -.2318316    .8291737    .5841724    1    .8291737    .5841724


      Moreover, the results for Z and the intercept vary depending on whether I specify the dummy as D or as i.D (although I believe that, when D is in {0,1}, D is equivalent to i.D in Stata syntax):

      Code:
      . regress XD i.D Z i.D#c.Z, vce(robust)
      
      Linear regression                               Number of obs     =      1,550
                                                      F(2, 1546)        =          .
                                                      Prob > F          =          .
                                                      R-squared         =     0.8504
                                                      Root MSE          =     .35378
      
      ------------------------------------------------------------------------------
                   |               Robust
                XD |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                 D |
                1  |   .0663351   .0504838     1.31   0.189    -.0326888     .165359
                 Z |  -4.28e-15   7.85e-10    -0.00   1.000    -1.54e-09    1.54e-09
                   |
             D#c.Z |
                1  |   1.177761   .0644713    18.27   0.000       1.0513    1.304221
                   |
             _cons |   5.77e-15   9.48e-10     0.00   1.000    -1.86e-09    1.86e-09
      ------------------------------------------------------------------------------
      
      . regress XD D Z D#c.Z, vce(robust)
      
      Linear regression                               Number of obs     =      1,550
                                                      F(2, 1546)        =          .
                                                      Prob > F          =          .
                                                      R-squared         =     0.8504
                                                      Root MSE          =     .35378
      
      ------------------------------------------------------------------------------
                   |               Robust
                XD |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                 D |   .0663351   .0504838     1.31   0.189    -.0326888     .165359
                 Z |  -5.06e-15          .        .       .            .           .
                   |
             D#c.Z |
                1  |   1.177761   .0644713    18.27   0.000       1.0513    1.304221
                   |
             _cons |   6.99e-15          .        .       .            .           .
      ------------------------------------------------------------------------------
      Last edited by Nadiia Lazhevska; 01 Aug 2017, 14:42.
