I am experimenting with fixed-effects (FE) regression models, comparing specifications that include and exclude certain FEs. My data are at the physician-hospital-month level. The outcome variable measures how much money a given physician spends in a given hospital in a given month.
First, I include year-month and hospital FEs (ym and cnes below). [I also absorb a categorical variable for physician age, age_int.] The R2 is around 90%, and the F-statistic for the joint test of a set of variables (iv_age, iv_fem, iv_uni, which I intend to use later as instruments) is around 52.
When I drop the hospital FE, the R2 falls to around 3%, which suggests that the hospital is a strong determinant of physicians' recorded costs. To my surprise, the F-statistic for the same set of variables moves sharply in the opposite direction: it rises to almost 700.
Is it normal for the R2 and the F-statistic to move in opposite directions? I found here that the relationship is expected to be positive: https://stats.stackexchange.com/ques...e%20non%2Dzero.
Code:
. reghdfe avg_peer_val iv_age iv_fem iv_uni pat_fem pat_age, absorb(ym cnes age_int) vce(cluster pf_cpfid)
(dropped 79 singleton observations)
(MWFE estimator converged in 6 iterations)
HDFE Linear regression Number of obs = 7,148,919
Absorbing 3 HDFE groups F( 5, 141296) = 39.71
Statistics robust to heteroskedasticity Prob > F = 0.0000
R-squared = 0.8983
Adj R-squared = 0.8982
Within R-sq. = 0.0001
Number of clusters (pf_cpfid) = 141,297 Root MSE = 679.9987
(Std. Err. adjusted for 141,297 clusters in pf_cpfid)
------------------------------------------------------------------------------
             |               Robust
avg_peer_val |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      iv_age |   -1.87275   .2679516    -6.99   0.000     -2.39793    -1.34757
      iv_fem |  -75.88635   8.992532    -8.44   0.000    -93.51154   -58.26115
      iv_uni |   18.55822   7.547507     2.46   0.014     3.765247    33.35118
     pat_fem |  -7.576075   1.116572    -6.79   0.000    -9.764535   -5.387615
     pat_age |  -.0107603   .0209083    -0.51   0.607    -.0517401    .0302195
       _cons |   2243.674   12.12901   184.98   0.000     2219.902    2267.447
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
ym | 90 0 90 |
cnes | 4726 1 4725 |
age_int | 17 1 16 ?|
-----------------------------------------------------+
? = number of redundant parameters may be higher
. test iv_age iv_fem iv_uni
( 1) iv_age = 0
( 2) iv_fem = 0
( 3) iv_uni = 0
F( 3,141296) = 52.74
Prob > F = 0.0000
.
end of do-file
. do "C:\Users\Paula\AppData\Local\Temp\STD33cc_000000.tmp"
. reghdfe avg_peer_val iv_age iv_fem iv_uni pat_fem pat_age, absorb(ym age_int) vce(cluster pf_cpfid)
(MWFE estimator converged in 4 iterations)
HDFE Linear regression Number of obs = 7,148,998
Absorbing 2 HDFE groups F( 5, 141344) = 1173.01
Statistics robust to heteroskedasticity Prob > F = 0.0000
R-squared = 0.0330
Adj R-squared = 0.0330
Within R-sq. = 0.0260
Number of clusters (pf_cpfid) = 141,345 Root MSE = 2096.0537
(Std. Err. adjusted for 141,345 clusters in pf_cpfid)
------------------------------------------------------------------------------
             |               Robust
avg_peer_val |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      iv_age |  -31.59436   1.221342   -25.87   0.000    -33.98816   -29.20055
      iv_fem |    411.207   43.82209     9.38   0.000     325.3165    497.0974
      iv_uni |   738.8013   27.60647    26.76   0.000     684.6931    792.9094
     pat_fem |  -386.0751   8.452463   -45.68   0.000    -402.6417   -369.5084
     pat_age |   10.79922   .2144286    50.36   0.000     10.37895     11.2195
       _cons |   2902.447   62.06236    46.77   0.000     2780.806    3024.088
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
ym | 90 0 90 |
age_int | 17 1 16 |
-----------------------------------------------------+
. test iv_age iv_fem iv_uni // when excluding hospital FE, R2 decreases by a lot and F-stat increases
( 1) iv_age = 0
( 2) iv_fem = 0
( 3) iv_uni = 0
F( 3,141344) = 698.31
Prob > F = 0.0000
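
To check whether this pattern can arise mechanically, here is a minimal Python sketch on simulated data (not my data). Everything in it is an assumption made for illustration: the variable names, the sample sizes, and the parameter values are hypothetical, and it uses the classical (non-clustered) F-statistic rather than the cluster-robust test in the Stata output above. The regressor x varies mostly within hospitals but is mildly correlated with a large hospital-level effect, so without hospital dummies it picks up part of that effect.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def fit_stats(y, x, controls):
    """Return (R2 of the full model, classical F for H0: coefficient on x = 0)."""
    n = len(y)
    full = np.column_stack([x, controls])
    ssr_u = ssr(y, full)        # unrestricted: x plus controls
    ssr_r = ssr(y, controls)    # restricted: controls only
    q = 1                       # one tested coefficient
    k = np.linalg.matrix_rank(full)
    f = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))
    tss = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ssr_u / tss, f

# Hypothetical design: 200 hospitals, 20 observations each.
rng = np.random.default_rng(0)
n_hosp, n_per = 200, 20
hosp = np.repeat(np.arange(n_hosp), n_per)

z = rng.normal(size=n_hosp)                      # hospital-level component
x = 0.3 * z[hosp] + rng.normal(size=hosp.size)   # regressor, mildly tied to hospital
y = 10.0 * z[hosp] + 0.1 * x + rng.normal(size=hosp.size)

const = np.ones((hosp.size, 1))                  # pooled model: constant only
dummies = np.eye(n_hosp)[hosp]                   # FE model: one dummy per hospital

r2_pooled, f_pooled = fit_stats(y, x, const)
r2_fe, f_fe = fit_stats(y, x, dummies)

print(f"no hospital FE:   R2 = {r2_pooled:.3f}, F = {f_pooled:.1f}")
print(f"with hospital FE: R2 = {r2_fe:.3f}, F = {f_fe:.1f}")
```

In this toy setup the hospital dummies raise the overall R2 because they explain most of y, while the F-statistic on x falls because x no longer gets credit for the hospital-level variation. Whether the same mechanism is at work in my regressions is something I cannot tell from the simulation alone.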
