Why is R-squared reported in 2SLS regressions in several top-tier journals?

Darian Mistoha

Join Date: Aug 2024
Posts: 9

03 Sep 2024, 03:33

Hello everyone,

I am currently running a robustness test using a 2SLS regression.

Code:

 eststo: ivreghdfe DC y1 y2 y3 y4 y5 y6 dummy (y7 = L.y1 L.y2 L.y3 L.y4 L.y5 L.y6 dummy L2.y7), absorb (Industry Year Country) vce(cluster Firm) noconstant

Code:

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on Firm

Number of clusters (Firm) =     11250                Number of obs =    77772
                                                      F(  8, 11249) =    75.59
                                                      Prob > F      =   0.0000
Total (centered) SS     =  533.5124222                Centered R2   =   0.0337
Total (uncentered) SS   =  533.5124222                Uncentered R2 =   0.0337
Residual SS             =  515.5344716                Root MSE      =   .08906

-----------------------------------------------------------------------------------
                  |               Robust
               DC | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
------------------+----------------------------------------------------------------
               y7 |  -.0017295   .0019756    -0.88   0.381     -.005602    .0021431
               y1 |   .0050585   .0011079     4.57   0.000     .0028869    .0072301
               y2 |  -.0074961   .0008742    -8.57   0.000    -.0092098   -.0057825
               y3 |  -.0329384   .0044186    -7.45   0.000    -.0415996   -.0242772
               y4 |  -.0043362   .0014496    -2.99   0.003    -.0071778   -.0014947
               y5 |  -.0000126    .000036    -0.35   0.728    -.0000832    .0000581
               y6 |   -.073603   .0044549   -16.52   0.000    -.0823353   -.0648707
            dummy |   .0023218   .0017758     1.31   0.191     -.001159    .0058026
-----------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):           2229.954
                                                   Chi-sq(7) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):             4271.606
                         (Kleibergen-Paap rk Wald F statistic):       1357.372
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    19.86
                                         10% maximal IV relative bias    11.29
                                         20% maximal IV relative bias     6.73
                                         30% maximal IV relative bias     5.07
                                         10% maximal IV size             31.50
                                         15% maximal IV size             17.38
                                         20% maximal IV size             12.48
                                         25% maximal IV size              9.93
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):       354.263
                                                   Chi-sq(6) P-val =    0.0000
------------------------------------------------------------------------------
Instrumented:         y7
Included instruments: y1 y2 y3 y4 y5
                      y6 dummy
Excluded instruments: L.y2 L.y3 L.y4 L.y5 L.y6
                      L.dummy L2.y7
Partialled-out:       _cons
                      nb: total SS, model F and R2s are after partialling-out;
                          any small-sample adjustments include partialled-out
                          variables in regressor count K
Duplicates:           initial_debtcost2
------------------------------------------------------------------------------

Absorbed degrees of freedom:
----------------------------------------------------------+
      Absorbed FE | Categories  - Redundant  = Num. Coefs |
------------------+---------------------------------------|
         Industry |      1104           0        1104     |
             YEAR |      8722          31        8691     |
          COUNTRY |      3348         381        2967    ?|
----------------------------------------------------------+
? = number of redundant parameters may be higher

Based on this article (https://www.stata.com/support/faqs/s...least-squares/), I thought that reporting R-squared does not make sense for a 2SLS regression. Still, I find many papers in top-tier journals that report R-squared for their 2SLS regressions. Should I therefore include my centered R-squared despite the Stata article, or leave it out?
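
For context, as far as I can tell the centered R2 in the output above is simply 1 - RSS/TSS computed from the reported sums of squares. A quick check, using only the two numbers printed in the table (nothing here is re-estimated):

```stata
* Centered R2 recomputed by hand from the output table above:
* Residual SS = 515.5344716, Total (centered) SS = 533.5124222
display 1 - 515.5344716/533.5124222
```

This comes out at roughly 0.0337, which matches the reported Centered R2. My understanding is that the 2SLS residuals are computed with the actual y7 rather than its first-stage fitted values, so the residual SS is not guaranteed to be smaller than the total SS. That would explain why this statistic can even turn negative and does not have the usual OLS interpretation.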

I am a bit confused about what to show and what not to show. I hope you can help me! :-)

Last edited by Darian Mistoha; 03 Sep 2024, 03:38.
