
  • Effective F-stat different from KP F-stat when running ivreghdfe + weakivtest

    The postestimation command -weakivtest- works after -ivreg2- but not after -ivreghdfe-, so I am using the program below to make Stata treat the -ivreghdfe- results as -ivreg2- results, which lets -weakivtest- run afterwards. I took this program from the replication package Code For: Peer Effect in Product Adoption (openicpsr.org). I need this workaround because my real data set is massive and my IV estimation includes three fixed effects with thousands of categories, which I absorb with -ivreghdfe-. Running -ivreg2- does not seem feasible in my context.

    Code:
    * Relabel stored IV results (e.g. from ivreghdfe) as ivreg2 results,
    * so that postestimation commands that check e(cmd) will run
    prog pretend_to_be_ivreg2, eclass
        ereturn local cmd = "ivreg2"
    end
    I am estimating several specifications, some with a single IV and others with multiple IVs. I have managed to run -weakivtest- after -ivreghdfe- followed by -pretend_to_be_ivreg2- (the program defined above). However, in specifications with a single instrument, the Effective F-stat I get is much lower than the Kleibergen-Paap rk Wald F statistic. According to Andrews et al. (2019), the two should be identical with a single IV and a single endogenous variable. I replicated this issue with the simple example below.
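    For reference, the single-instrument equivalence can be written out as follows. The notation is an assumption chosen here for illustration: \hat\pi is the first-stage coefficient vector on the k excluded instruments, \hat\Sigma_{\pi\pi} is its robust variance estimate, and \hat Q_{ZZ} is the sample second-moment matrix of the instruments. Finite-sample scaling may differ across implementations, so this is a sketch of the asymptotic statistics only.

    ```latex
    F^{\mathrm{Eff}} = \frac{\hat\pi' \hat Q_{ZZ} \hat\pi}{\operatorname{tr}\!\left(\hat\Sigma_{\pi\pi} \hat Q_{ZZ}\right)},
    \qquad
    F^{R} = \frac{\hat\pi' \hat\Sigma_{\pi\pi}^{-1} \hat\pi}{k}.

    % With k = 1 every quantity is a scalar, so \hat Q_{ZZ} cancels:
    F^{\mathrm{Eff}}
      = \frac{\hat\pi^{2}\, \hat Q_{ZZ}}{\hat\Sigma_{\pi\pi}\, \hat Q_{ZZ}}
      = \frac{\hat\pi^{2}}{\hat\Sigma_{\pi\pi}}
      = F^{R}.
    ```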

    Code:
    . sysuse auto, clear
    (1978 automobile data)
    
    . ivreghdfe price (weight=length), absorb(turn) cluster(turn) // coeff: 4.246995, SE: .8023496, Fstat: 84.916
    (dropped 4 singleton observations)
    (MWFE estimator converged in 1 iterations)
    
    IV (2SLS) estimation
    --------------------
    
    Estimates efficient for homoskedasticity only
    Statistics robust to heteroskedasticity and clustering on turn
    
    Number of clusters (turn) =         14                Number of obs =       70
                                                          F(  1,    13) =    28.02
                                                          Prob > F      =   0.0001
    Total (centered) SS     =  436283540.4                Centered R2   =   0.4359
    Total (uncentered) SS   =  436283540.4                Uncentered R2 =   0.4359
    Residual SS             =  246111883.9                Root MSE      =     1902
    
    ------------------------------------------------------------------------------
                 |               Robust
           price | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          weight |   4.246995   .8023496     5.29   0.000     2.513624    5.980366
    ------------------------------------------------------------------------------
    Underidentification test (Kleibergen-Paap rk LM statistic):              6.096
                                                       Chi-sq(1) P-val =    0.0136
    ------------------------------------------------------------------------------
    Weak identification test (Cragg-Donald Wald F statistic):               82.964
                             (Kleibergen-Paap rk Wald F statistic):         84.916
    Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                             15% maximal IV size              8.96
                                             20% maximal IV size              6.66
                                             25% maximal IV size              5.53
    Source: Stock-Yogo (2005).  Reproduced by permission.
    NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
    ------------------------------------------------------------------------------
    Hansen J statistic (overidentification test of all instruments):         0.000
                                                     (equation exactly identified)
    ------------------------------------------------------------------------------
    Instrumented:         weight
    Excluded instruments: length
    Partialled-out:       _cons
                          nb: total SS, model F and R2s are after partialling-out;
                              any small-sample adjustments include partialled-out
                              variables in regressor count K
    ------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
            turn |        14          14           0    *|
    -----------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    
    . pretend_to_be_ivreg2
    
    . weakivtest // 16.256
    (obs=70)
    
    Montiel-Pflueger robust weak instrument test
    --------------------------------------------
    Effective F statistic:       16.256
    Confidence level alpha:          5%
    --------------------------------------------
    
    --------------------------------------------
    Critical Values             TSLS      LIML
    --------------------------------------------
    % of Worst Case Bias
    tau=5%                    37.418    37.418
    tau=10%                   23.109    23.109
    tau=20%                   15.062    15.062
    tau=30%                   12.039    12.039
    --------------------------------------------
    
    .
    end of do-file
    
    .
    When I run the same regression with -ivreg2-, the Effective F-stat and the KP F-stat take the same value, as expected.

    Code:
    . sysuse auto, clear
    (1978 automobile data)
    
    .
    . ivreg2 price (weight=length) i.turn, cluster(turn) small // coef: 4.246995, SE: .9098988, KP Fstat: 66.028
    
    IV (2SLS) estimation
    --------------------
    
    Estimates efficient for homoskedasticity only
    Statistics robust to heteroskedasticity and clustering on turn
    
    Number of clusters (turn) =         18                Number of obs =       74
                                                          F( 18,    17) =     0.00
                                                          Prob > F      =   1.0000
    Total (centered) SS     =  635065396.1                Centered R2   =   0.6125
    Total (uncentered) SS   =   3447834321                Uncentered R2 =   0.9286
    Residual SS             =  246111883.9                Root MSE      =     2115
    
    ------------------------------------------------------------------------------
                 |               Robust
           price | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          weight |   4.246995   .9098988     4.67   0.000     2.327276    6.166713
                 |
            turn |
             32  |   5011.514   1055.483     4.75   0.000      2784.64    7238.387
             33  |   5166.623   1173.769     4.40   0.000     2690.186     7643.06
             34  |   5058.718   943.2618     5.36   0.000     3068.609    7048.826
             35  |   4229.605   876.5359     4.83   0.000     2380.276    6078.934
             36  |   5187.873   879.5689     5.90   0.000     3332.144    7043.601
             37  |   4947.753    377.608    13.10   0.000     4151.069    5744.436
             38  |   5900.982   257.8047    22.89   0.000     5357.061    6444.902
             39  |   1873.197   545.9393     3.43   0.003     721.3655    3025.028
             40  |   486.6475   131.9353     3.69   0.002     208.2883    765.0068
             41  |   1896.418   88.71514    21.38   0.000     1709.245    2083.591
             42  |  -679.5457   245.6727    -2.77   0.013     -1197.87   -161.2216
             43  |   730.7564   367.7508     1.99   0.063    -45.12995    1506.643
             44  |      94.51   558.0713     0.17   0.868    -1082.917    1271.937
             45  |   1680.927   633.8962     2.65   0.017     343.5231    3018.331
             46  |  -1073.264   424.6195    -2.53   0.022    -1969.133   -177.3954
             48  |  -156.3634   1100.978    -0.14   0.889    -2479.223    2166.496
             51  |  -57.01098   1510.432    -0.04   0.970    -3243.744    3129.722
                 |
           _cons |  -9001.443   2893.478    -3.11   0.006    -15106.15   -2896.737
    ------------------------------------------------------------------------------
    Underidentification test (Kleibergen-Paap rk LM statistic):              6.096
                                                       Chi-sq(1) P-val =    0.0136
    ------------------------------------------------------------------------------
    Weak identification test (Cragg-Donald Wald F statistic):               67.103
                             (Kleibergen-Paap rk Wald F statistic):         66.028
    Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                             15% maximal IV size              8.96
                                             20% maximal IV size              6.66
                                             25% maximal IV size              5.53
    Source: Stock-Yogo (2005).  Reproduced by permission.
    NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
    ------------------------------------------------------------------------------
    Warning: estimated covariance matrix of moment conditions not of full rank.
             overidentification statistic not reported, and standard errors and
             model tests should be interpreted with caution.
    Possible causes:
             number of clusters insufficient to calculate robust covariance matrix
             singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
    partial option may address problem.
    ------------------------------------------------------------------------------
    Instrumented:         weight
    Included instruments: 32.turn 33.turn 34.turn 35.turn 36.turn 37.turn 38.turn
                          39.turn 40.turn 41.turn 42.turn 43.turn 44.turn 45.turn
                          46.turn 48.turn 51.turn
    Excluded instruments: length
    ------------------------------------------------------------------------------
    
    . weakivtest // effective F-stat = Kleibergen-Paap rk Wald F statistic
    (obs=74)
    
    Montiel-Pflueger robust weak instrument test
    --------------------------------------------
    Effective F statistic:       66.028
    Confidence level alpha:          5%
    --------------------------------------------
    
    --------------------------------------------
    Critical Values             TSLS      LIML
    --------------------------------------------
    % of Worst Case Bias
    tau=5%                    37.418    37.418
    tau=10%                   23.109    23.109
    tau=20%                   15.062    15.062
    tau=30%                   12.039    12.039
    --------------------------------------------
    Would anyone know what is going wrong with the Effective F-stat in the first example (-ivreghdfe- + -weakivtest-)?
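
    One possible diagnostic, offered as a sketch rather than a confirmed explanation: it assumes that -weakivtest- rebuilds its statistic from the variable lists stored in e(), in which case the absorbed fixed effects would be invisible to it. The variable name hdfe_sample is hypothetical. If the Effective F-stat from the last command is close to 16.256, that would suggest the fixed effects are being ignored after -pretend_to_be_ivreg2-.

    ```stata
    sysuse auto, clear
    ivreghdfe price (weight=length), absorb(turn) cluster(turn)

    * Inspect what a postestimation command can see in e(); check whether
    * the absorbed FE (turn) appear in any of the stored variable lists.
    ereturn list

    * Keep the ivreghdfe estimation sample (singletons dropped).
    generate byte hdfe_sample = e(sample)

    * ASSUMPTION being tested: weakivtest re-estimates from e() contents only.
    * Same model WITHOUT fixed effects, on the ivreghdfe sample.
    ivreg2 price (weight=length) if hdfe_sample, cluster(turn) small
    weakivtest
    ```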



    References
    Andrews I, Stock JH, Sun L. Weak Instruments in IV Regression: Theory and Practice. Annual Review of Economics. 2019.
    Last edited by Paula de Souza Leao Spinola; 24 Jan 2023, 15:36.

  • #2
    I also get Effective F-stat = KP F-stat when I absorb the FE by demeaning (in the example above), following Professor Jeff Wooldridge's recommendation in the post IV estimation with HDFE and weakivtest - Statalist.
    The issue I am trying to resolve is how to get Effective F-stat = KP F-stat (with a single IV) after running -ivreghdfe-, given that I don't seem to have any other option because of the size of my data and the number of FE categories.

    Code:
    . // Absorbing FE manually (i.e., demeaning)
    . foreach v in price weight length {
      2.         egen `v'bar = mean(`v'), by(turn)
      3.         gen `v'd = `v' - `v'bar
      4. }
    
    . ivreg2 priced (weightd=lengthd), cluster(turn) small
    
    IV (2SLS) estimation
    --------------------
    
    Estimates efficient for homoskedasticity only
    Statistics robust to heteroskedasticity and clustering on turn
    
    Number of clusters (turn) =         18                Number of obs =       74
                                                          F(  1,    17) =    28.52
                                                          Prob > F      =   0.0001
    Total (centered) SS     =  436283540.4                Centered R2   =   0.4359
    Total (uncentered) SS   =  436283540.4                Uncentered R2 =   0.4359
    Residual SS             =  246111883.9                Root MSE      =     1849
    
    ------------------------------------------------------------------------------
                 |               Robust
          priced | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
         weightd |   4.246995   .7952583     5.34   0.000     2.569146    5.924843
           _cons |  -.0000174   .0000812    -0.21   0.833    -.0001887     .000154
    ------------------------------------------------------------------------------
    Underidentification test (Kleibergen-Paap rk LM statistic):              6.096
                                                       Chi-sq(1) P-val =    0.0136
    ------------------------------------------------------------------------------
    Weak identification test (Cragg-Donald Wald F statistic):               87.844
                             (Kleibergen-Paap rk Wald F statistic):         86.437
    Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                             15% maximal IV size              8.96
                                             20% maximal IV size              6.66
                                             25% maximal IV size              5.53
    Source: Stock-Yogo (2005).  Reproduced by permission.
    NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
    ------------------------------------------------------------------------------
    Hansen J statistic (overidentification test of all instruments):         0.000
                                                     (equation exactly identified)
    ------------------------------------------------------------------------------
    Instrumented:         weightd
    Excluded instruments: lengthd
    ------------------------------------------------------------------------------
    
    . weakivtest // same as KP F-stat, as expected
    (obs=74)
    
    Montiel-Pflueger robust weak instrument test
    --------------------------------------------
    Effective F statistic:       86.437
    Confidence level alpha:          5%
    --------------------------------------------
    
    --------------------------------------------
    Critical Values             TSLS      LIML
    --------------------------------------------
    % of Worst Case Bias
    tau=5%                    37.418    37.418
    tau=10%                   23.109    23.109
    tau=20%                   15.062    15.062
    tau=30%                   12.039    12.039
    --------------------------------------------
    
    .
    . ivregress 2sls priced (weightd=lengthd), vce(cluster turn) small
    
    Instrumental variables 2SLS regression            Number of obs   =         74
                                                      F(  1,    17)   =      28.52
                                                      Prob > F        =     0.0001
                                                      R-squared       =     0.4359
                                                      Adj R-squared   =     0.4281
                                                      Root MSE        =     1848.8
    
                                      (Std. err. adjusted for 18 clusters in turn)
    ------------------------------------------------------------------------------
                 |               Robust
          priced | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
         weightd |   4.246995   .7952583     5.34   0.000     2.569146    5.924843
           _cons |  -.0000174   .0000812    -0.21   0.833    -.0001887     .000154
    ------------------------------------------------------------------------------
    Instrumented: weightd
     Instruments: lengthd
    
    . weakivtest // same as KP F-stat, as expected
    (obs=74)
    
    Montiel-Pflueger robust weak instrument test
    --------------------------------------------
    Effective F statistic:       86.437
    Confidence level alpha:          5%
    --------------------------------------------
    
    --------------------------------------------
    Critical Values             TSLS      LIML
    --------------------------------------------
    % of Worst Case Bias
    tau=5%                    37.418    37.418
    tau=10%                   23.109    23.109
    tau=20%                   15.062    15.062
    tau=30%                   12.039    12.039
    --------------------------------------------
    
    .
    end of do-file
    
    .
    However, the F-stat above (86.437) is different from the one we get when we include the FE as regressors in the -ivreg2- regression (66.028). I am not sure why.
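
    A possible reconciliation, sketched under assumptions: suppose -ivreg2- with cluster() computes the KP F as chi2/(N-1) * (N-L) * (G-1)/G divided by the number of excluded instruments, where L counts all instruments including the constant and G is the number of clusters. Then the reported values differ only through the degrees-of-freedom scaling. The scalar names below are hypothetical, and the last step additionally assumes that -ivreghdfe- counts one absorbed degree of freedom for the constant and that dropping singletons leaves the unadjusted chi2 unchanged.

    ```stata
    * Back out the unadjusted chi2 from the demeaned run:
    * N = 74, G = 18, L = 2 (lengthd and _cons), F = 86.437
    scalar kp_chi2 = 86.437 * (74-1)/(74-2) * 18/(18-1)

    * ivreg2 with i.turn: same N and G, but L = 19
    * (length, 17 turn dummies, _cons)
    scalar F_dummies = kp_chi2/(74-1) * (74-19) * (18-1)/18
    display %9.3f F_dummies    // 66.028

    * ivreghdfe: N = 70 and G = 14 after singletons are dropped;
    * ASSUMPTION: L = 1 (length) plus 1 absorbed DoF for the constant
    scalar F_hdfe = kp_chi2/(70-1) * (70-1-1) * (14-1)/14
    display %9.3f F_hdfe       // 84.916
    ```

    Under these assumptions the three KP F-stats (86.437, 66.028, 84.916) come from the same underlying statistic, and only the small-sample adjustment changes. This does not explain the 16.256 Effective F-stat, which is a separate issue.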
    Last edited by Paula de Souza Leao Spinola; 24 Jan 2023, 16:03.
