Effective F-stat different from KP F-stat when running ivreghdfe + weakivtest

Paula de Souza Leao Spinola

Join Date: Jun 2015
Posts: 384

24 Jan 2023, 14:57

Because the postestimation command -weakivtest- works after -ivreg2- but not after -ivreghdfe-, I am using the program below to make Stata treat the -ivreghdfe- results as if they came from -ivreg2-, so that -weakivtest- runs afterwards. I took this program from the replication package Code For: Peer Effect in Product Adoption (openicpsr.org). I need this workaround because my real data set is massive and my IV estimation includes 3 sets of FE with thousands of categories, which I absorb with -ivreghdfe-. Running -ivreg2- with the FE as explicit regressors does not seem feasible in my context.

Code:

* Relabel stored IV results (e.g., from -ivreghdfe-) as coming from -ivreg2-
* so that -weakivtest- accepts them. Only e(cmd) changes; estimates are untouched.
program define pretend_to_be_ivreg2, eclass
    ereturn local cmd "ivreg2"
end

I am estimating several specifications, some with a single IV and others with multiple IVs. I have managed to run -weakivtest- after -ivreghdfe- followed by -pretend_to_be_ivreg2- (the program defined above). However, in specifications with a single instrument I get an effective F statistic that is much lower than the Kleibergen-Paap rk Wald F statistic. According to Andrews et al. (2019), the two should be identical with a single IV and a single endogenous variable. I replicated this issue with the simple example below.
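For reference, a short sketch of why the two statistics should coincide with one instrument. The notation is an assumption introduced here, loosely following Montiel Olea and Pflueger: $x$ is the endogenous regressor and $Z$ the instrument matrix, both after partialling out the controls and FE, with $Z$ normalized so that $Z'Z/S = I_K$; $S$ is the number of observations, $K$ the number of instruments, $\hat\pi = Z'x/S$ the first-stage coefficient, and $\hat W_2$ the robust estimate of the asymptotic variance of $\sqrt{S}\,\hat\pi$.

```latex
F_{\text{eff}}
  = \frac{x' Z Z' x}{S \,\operatorname{tr}(\hat W_2)}
  = \frac{S\,\hat\pi'\hat\pi}{\operatorname{tr}(\hat W_2)},
\qquad
K = 1:\quad
F_{\text{eff}}
  = \frac{\hat\pi^{2}}{\hat W_2 / S}
  = \frac{\hat\pi^{2}}{\widehat{\operatorname{Var}}(\hat\pi)} .
```

With $K = 1$ the last expression is the squared robust t statistic on the instrument in the first stage, which is also what the Kleibergen-Paap rk Wald F reduces to with one endogenous variable, up to finite-sample degrees-of-freedom adjustments. The equality relies on $x$ and $Z$ being partialled out in the same way in both calculations.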

Code:

. sysuse auto, clear
(1978 automobile data)

. ivreghdfe price (weight=length), absorb(turn) cluster(turn) // coeff: 4.246995, SE: .8023496, Fstat: 84.916
(dropped 4 singleton observations)
(MWFE estimator converged in 1 iterations)

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on turn

Number of clusters (turn) =         14                Number of obs =       70
                                                      F(  1,    13) =    28.02
                                                      Prob > F      =   0.0001
Total (centered) SS     =  436283540.4                Centered R2   =   0.4359
Total (uncentered) SS   =  436283540.4                Uncentered R2 =   0.4359
Residual SS             =  246111883.9                Root MSE      =     1902

------------------------------------------------------------------------------
             |               Robust
       price | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      weight |   4.246995   .8023496     5.29   0.000     2.513624    5.980366
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):              6.096
                                                   Chi-sq(1) P-val =    0.0136
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               82.964
                         (Kleibergen-Paap rk Wald F statistic):         84.916
Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                         15% maximal IV size              8.96
                                         20% maximal IV size              6.66
                                         25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         0.000
                                                 (equation exactly identified)
------------------------------------------------------------------------------
Instrumented:         weight
Excluded instruments: length
Partialled-out:       _cons
                      nb: total SS, model F and R2s are after partialling-out;
                          any small-sample adjustments include partialled-out
                          variables in regressor count K
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
        turn |        14          14           0    *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

. pretend_to_be_ivreg2

. weakivtest // 16.256
(obs=70)

Montiel-Pflueger robust weak instrument test
--------------------------------------------
Effective F statistic:       16.256
Confidence level alpha:          5%
--------------------------------------------

--------------------------------------------
Critical Values             TSLS      LIML
--------------------------------------------
% of Worst Case Bias
tau=5%                    37.418    37.418
tau=10%                   23.109    23.109
tau=20%                   15.062    15.062
tau=30%                   12.039    12.039
--------------------------------------------

When I run the same regression with -ivreg2-, the effective F-stat and the KP F-stat coincide, as expected.

Code:

. sysuse auto, clear
(1978 automobile data)

.
. ivreg2 price (weight=length) i.turn, cluster(turn) small // coef: 4.246995, SE: .9098988, KP Fstat: 66.028

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on turn

Number of clusters (turn) =         18                Number of obs =       74
                                                      F( 18,    17) =     0.00
                                                      Prob > F      =   1.0000
Total (centered) SS     =  635065396.1                Centered R2   =   0.6125
Total (uncentered) SS   =   3447834321                Uncentered R2 =   0.9286
Residual SS             =  246111883.9                Root MSE      =     2115

------------------------------------------------------------------------------
             |               Robust
       price | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      weight |   4.246995   .9098988     4.67   0.000     2.327276    6.166713
             |
        turn |
         32  |   5011.514   1055.483     4.75   0.000      2784.64    7238.387
         33  |   5166.623   1173.769     4.40   0.000     2690.186     7643.06
         34  |   5058.718   943.2618     5.36   0.000     3068.609    7048.826
         35  |   4229.605   876.5359     4.83   0.000     2380.276    6078.934
         36  |   5187.873   879.5689     5.90   0.000     3332.144    7043.601
         37  |   4947.753    377.608    13.10   0.000     4151.069    5744.436
         38  |   5900.982   257.8047    22.89   0.000     5357.061    6444.902
         39  |   1873.197   545.9393     3.43   0.003     721.3655    3025.028
         40  |   486.6475   131.9353     3.69   0.002     208.2883    765.0068
         41  |   1896.418   88.71514    21.38   0.000     1709.245    2083.591
         42  |  -679.5457   245.6727    -2.77   0.013     -1197.87   -161.2216
         43  |   730.7564   367.7508     1.99   0.063    -45.12995    1506.643
         44  |      94.51   558.0713     0.17   0.868    -1082.917    1271.937
         45  |   1680.927   633.8962     2.65   0.017     343.5231    3018.331
         46  |  -1073.264   424.6195    -2.53   0.022    -1969.133   -177.3954
         48  |  -156.3634   1100.978    -0.14   0.889    -2479.223    2166.496
         51  |  -57.01098   1510.432    -0.04   0.970    -3243.744    3129.722
             |
       _cons |  -9001.443   2893.478    -3.11   0.006    -15106.15   -2896.737
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):              6.096
                                                   Chi-sq(1) P-val =    0.0136
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               67.103
                         (Kleibergen-Paap rk Wald F statistic):         66.028
Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                         15% maximal IV size              8.96
                                         20% maximal IV size              6.66
                                         25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Warning: estimated covariance matrix of moment conditions not of full rank.
         overidentification statistic not reported, and standard errors and
         model tests should be interpreted with caution.
Possible causes:
         number of clusters insufficient to calculate robust covariance matrix
         singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
partial option may address problem.
------------------------------------------------------------------------------
Instrumented:         weight
Included instruments: 32.turn 33.turn 34.turn 35.turn 36.turn 37.turn 38.turn
                      39.turn 40.turn 41.turn 42.turn 43.turn 44.turn 45.turn
                      46.turn 48.turn 51.turn
Excluded instruments: length
------------------------------------------------------------------------------

. weakivtest // effective F-stat = Kleibergen-Paap rk Wald F statistic
(obs=74)

Montiel-Pflueger robust weak instrument test
--------------------------------------------
Effective F statistic:       66.028
Confidence level alpha:          5%
--------------------------------------------

--------------------------------------------
Critical Values             TSLS      LIML
--------------------------------------------
% of Worst Case Bias
tau=5%                    37.418    37.418
tau=10%                   23.109    23.109
tau=20%                   15.062    15.062
tau=30%                   12.039    12.039
--------------------------------------------

Would anyone know what is going wrong with the Effective F-stat in the first example (-ivreghdfe- + -weakivtest-)?
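A possible cross-check that does not depend on how -weakivtest- reads the stored results, sketched here under assumptions rather than as a confirmed diagnosis: with a single instrument, both statistics should match the cluster-robust F on the instrument in the first stage, so that F can be computed directly with -reghdfe- and compared with the two numbers above. The sketch assumes -reghdfe- is installed (which -ivreghdfe- already requires); small differences could still arise from degrees-of-freedom adjustments.

```stata
* Cross-check sketch: cluster-robust first-stage F computed directly.
* Assumes -reghdfe- is installed (-ivreghdfe- requires it).
sysuse auto, clear

* First stage: endogenous regressor on the instrument, absorbing turn
reghdfe weight length, absorb(turn) vce(cluster turn)

* Wald test on the single excluded instrument; r(F) is the robust first-stage F
test length
display "Robust first-stage F = " %9.3f r(F)
```

Whichever of the two reported statistics this F is close to would indicate which one reflects the FE-absorbed first stage.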

References
Andrews I, Stock JH, Sun L. Weak Instruments in Instrumental Variables Regression: Theory and Practice. Annual Review of Economics. 2019;11:727-753.

Last edited by Paula de Souza Leao Spinola; 24 Jan 2023, 15:36.

Paula de Souza Leao Spinola

24 Jan 2023, 15:48

I also get an effective F-stat equal to the KP F-stat when I absorb the FE manually by demeaning (in the same auto data example), following Professor Jeff Wooldridge's recommendation in the post IV estimation with HDFE and weakivtest - Statalist.
What I am trying to work out is how to get an effective F-stat equal to the KP F-stat (with a single IV) after running -ivreghdfe-, since the size of my data and the number of FE categories do not seem to leave me any other option.
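With several sets of FE, demeaning by hand with egen is no longer a one-pass operation. A possible alternative, sketched here as an assumption and not something tested on the real data, is to let -reghdfe- partial the FE out of each variable and then run -ivreg2- and -weakivtest- on the residuals. The variable names ending in _r are hypothetical, and the sketch assumes -reghdfe-, -ivreg2-, and -weakivtest- are installed.

```stata
* Sketch: partial out FE with -reghdfe-, then run -ivreg2- on the residuals.
* Assumes -reghdfe-, -ivreg2-, and -weakivtest- are installed.
* Variable names ending in _r are hypothetical.
sysuse auto, clear

foreach v in price weight length {
    * Regress each variable on the FE only and keep the residual
    quietly reghdfe `v', absorb(turn) residuals(`v'_r)
}

* By Frisch-Waugh-Lovell, 2SLS on the residuals gives the same coefficient
ivreg2 price_r (weight_r = length_r), cluster(turn) small
weakivtest
```

With the real data, absorb(turn) would list all three sets of FE. Two caveats: -reghdfe- drops singleton observations, so the residuals are missing for them, and the second step does not know how many degrees of freedom were absorbed, so standard errors and F statistics may differ from -ivreghdfe- by a degrees-of-freedom factor.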

Code:

. // Absorbing FE manually (i.e., demeaning)
. foreach v in price weight length {
        egen `v'bar = mean(`v'), by(turn)
        gen `v'd = `v' - `v'bar
  }

. ivreg2 priced (weightd=lengthd), cluster(turn) small

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on turn

Number of clusters (turn) =         18                Number of obs =       74
                                                      F(  1,    17) =    28.52
                                                      Prob > F      =   0.0001
Total (centered) SS     =  436283540.4                Centered R2   =   0.4359
Total (uncentered) SS   =  436283540.4                Uncentered R2 =   0.4359
Residual SS             =  246111883.9                Root MSE      =     1849

------------------------------------------------------------------------------
             |               Robust
      priced | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     weightd |   4.246995   .7952583     5.34   0.000     2.569146    5.924843
       _cons |  -.0000174   .0000812    -0.21   0.833    -.0001887     .000154
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):              6.096
                                                   Chi-sq(1) P-val =    0.0136
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               87.844
                         (Kleibergen-Paap rk Wald F statistic):         86.437
Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                         15% maximal IV size              8.96
                                         20% maximal IV size              6.66
                                         25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         0.000
                                                 (equation exactly identified)
------------------------------------------------------------------------------
Instrumented:         weightd
Excluded instruments: lengthd
------------------------------------------------------------------------------

. weakivtest // same as KP F-stat, as expected
(obs=74)

Montiel-Pflueger robust weak instrument test
--------------------------------------------
Effective F statistic:       86.437
Confidence level alpha:          5%
--------------------------------------------

--------------------------------------------
Critical Values             TSLS      LIML
--------------------------------------------
% of Worst Case Bias
tau=5%                    37.418    37.418
tau=10%                   23.109    23.109
tau=20%                   15.062    15.062
tau=30%                   12.039    12.039
--------------------------------------------

.
. ivregress 2sls priced (weightd=lengthd), vce(cluster turn) small

Instrumental variables 2SLS regression            Number of obs   =         74
                                                  F(  1,    17)   =      28.52
                                                  Prob > F        =     0.0001
                                                  R-squared       =     0.4359
                                                  Adj R-squared   =     0.4281
                                                  Root MSE        =     1848.8

                                  (Std. err. adjusted for 18 clusters in turn)
------------------------------------------------------------------------------
             |               Robust
      priced | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     weightd |   4.246995   .7952583     5.34   0.000     2.569146    5.924843
       _cons |  -.0000174   .0000812    -0.21   0.833    -.0001887     .000154
------------------------------------------------------------------------------
Instrumented: weightd
 Instruments: lengthd

. weakivtest // same as KP F-stat, as expected
(obs=74)

Montiel-Pflueger robust weak instrument test
--------------------------------------------
Effective F statistic:       86.437
Confidence level alpha:          5%
--------------------------------------------

--------------------------------------------
Critical Values             TSLS      LIML
--------------------------------------------
% of Worst Case Bias
tau=5%                    37.418    37.418
tau=10%                   23.109    23.109
tau=20%                   15.062    15.062
tau=30%                   12.039    12.039
--------------------------------------------

However, the F-stat above (86.437) differs from the one I get when I include the FE as regressors in the -ivreg2- regression (66.028), and I am not sure why.
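One possible explanation, offered as an assumption rather than something confirmed in this thread: with the small option, -ivreg2- applies a finite-sample adjustment that depends on the number of exogenous regressors and instruments it sees, so the 17 turn dummies are counted when they enter as regressors but not when they are demeaned out beforehand. The arithmetic below checks whether the ratio of the two degrees-of-freedom terms reconciles the reported statistics. The counts are read off the outputs above: 74 observations in both runs, 19 exogenous variables with dummies (length, 17 turn dummies, constant), and 2 after demeaning (lengthd, constant).

```stata
* Hypothetical check: do degrees-of-freedom counts reconcile the F statistics?
* Observations in both -ivreg2- runs
scalar nobs  = 74
* length + 17 turn dummies + constant
scalar L_dum = 19
* lengthd + constant
scalar L_dem = 2
scalar adj   = (nobs - L_dum) / (nobs - L_dem)

* Rescale the statistics from the demeaned run
scalar kp_adj = 86.437 * adj
scalar cd_adj = 87.844 * adj

* Compare with 66.028 (KP) and 67.103 (Cragg-Donald) from the dummy-variable run
display %9.3f kp_adj
display %9.3f cd_adj
```

Both rescaled values agree with the dummy-variable run to three decimals (66.028 and 67.103), which is consistent with the gap coming from how the absorbed dummies are counted rather than from a different first stage. Which count is appropriate when the FE are nested within clusters is a separate question.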

Last edited by Paula de Souza Leao Spinola; 24 Jan 2023, 16:03.
