Hello Statalist Community,
I hope you are well.
I have been trying to implement PPML FE IV with the control function approach, as in this topic. To test for a weak instrument, I run "ivregress 2sls" on data to which I have applied a within transformation manually (as a check, the xtivreg2 estimates are the same as those from this manual approach) and then call "weakivtest" to get the MO-P Effective F, as recommended by Andrews et al. (2019). In theory, if I am not mistaken, the first stage of the control function approach is the same as that of TSLS, so the MO-P Effective F statistic still applies. My question is: how can a change in the specification of a control variable affect the strength of the instrument, and therefore the Effective F statistic, so heavily?
My code is something like the following, where
Y = dependent variable (count)
X2 = endogenous explanatory variable (EEV)
Z = instrument
X1, X3 and X4 = control variables (all of them are continuous)
l_ = log()
id_sector_year = sector#year
id = municipality
Code:
xi: xtivreg2 Y l_X1 X3 X4 (l_X2 = l_Z) i.id_sector_year, i(id) fe first r
In this case, I obtain an Effective F of 1797.96, well above the tau=5% critical value of 37.418.
But if I run the following (the only change with respect to the former is the log transformation of X4)
Code:
xi: xtivreg2 Y l_X1 X3 l_X4 (l_X2 = l_Z) i.id_sector_year, i(id) fe first r
the Effective F is now 93.83 (again, well above the threshold).
I have also tested the main regression, that is
Code:
* First stage
reghdfe X2 Z X1 X3 X4, absorb(id id_sector_year) residuals
predict double v2hat, residuals
* Second stage
ppmlhdfe Y v2hat X2 X1 X3 X4, absorb(id id_sector_year) vce(cluster id)
And the results are very different: the estimates for X2 are around 8.96 in the log case and 1.48 in the linear case. For comparison, the PPML FE estimate (without IV) for X2 is around 0.88. That implies a "bias" of about 10 times and 1.7 times, respectively. A huge difference.
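For completeness, the weak-instrument check described above (manual within transformation, then "ivregress 2sls", then "weakivtest") could be sketched roughly as follows. This is only an illustration, not my exact code: it assumes that reghdfe, weakivtest and avar are installed, it partials out both sets of fixed effects by residualizing each variable with reghdfe, and the r_ prefix on the transformed variables is a made-up naming choice.

```stata
* Sketch only: assumes reghdfe, weakivtest and avar are installed from SSC.
* Variable names follow the post; the r_ prefix is hypothetical.

* Within transformation: partial the fixed effects out of every variable
foreach v of varlist Y l_X1 X3 X4 l_X2 l_Z {
    quietly reghdfe `v', absorb(id id_sector_year) residuals(r_`v')
}

* 2SLS on the transformed data (robust VCE, as with the "r" option above)
ivregress 2sls r_Y r_l_X1 r_X3 r_X4 (r_l_X2 = r_l_Z), vce(robust)

* Montiel Olea-Pflueger Effective F and critical values
weakivtest
```

For the second specification, X4 would be replaced by l_X4 in both the loop and the ivregress call. Because ivregress does not adjust the degrees of freedom for the absorbed fixed effects, the standard errors may differ slightly from those reported by xtivreg2.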
Thank you very much!
