Weak IV and log transformation of a control variable

Mauricio Carvalho

Join Date: May 2018

Posts: 22
#1

18 Mar 2023, 09:51

Hello Statalist Community,

I hope you are well.

I have been trying to implement the PPML FE IV with the control function, as in this topic. To test for a weak instrument, I am running "ivregress 2sls" after applying a within transformation manually (as a check, the xtivreg2 estimates are the same as those from this manual approach) and then calling "weakivtest" to get the Montiel Olea-Pflueger (MO-P) Effective F, as recommended by Andrews et al. (2019). In theory, if I am not mistaken, the first stage of the control function approach is the same as that of TSLS, and therefore the MO-P F statistic applies. My question is: how can a specification change in a control variable affect the strength of the IV itself, and therefore the Effective F statistic, so heavily?
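A minimal sketch of the manual check described above, for the specification with X4 in levels. It assumes the variable names listed in this post; the `w_` prefix is a hypothetical naming choice, and `reghdfe`, `weakivtest` and `avar` are community-contributed packages that must be installed separately. It is an illustration of the idea, not necessarily the exact code that produced the numbers reported here.

```stata
* Sketch only: the w_ prefix is hypothetical, not from the original post.
* Requires the community-contributed reghdfe, weakivtest and avar packages.

* Partial the municipality and sector#year fixed effects out of every variable.
* Caveat: if missing values differ across variables, restrict to a common sample first.
foreach v of varlist Y l_X1 X3 X4 l_X2 l_Z {
    reghdfe `v', absorb(id id_sector_year) residuals(w_`v')
}

* 2SLS on the transformed data, then the Montiel Olea-Pflueger effective F
ivregress 2sls w_Y w_l_X1 w_X3 w_X4 (w_l_X2 = w_l_Z), vce(robust)
weakivtest
```

Replacing `X4` with `l_X4` in both the loop and the `ivregress` line gives the second specification. Standard errors from this shortcut do not adjust the degrees of freedom for the absorbed fixed effects, so they can differ slightly from the xtivreg2 ones even when the point estimates match.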

My code is something like

Y = dependent variable (count)
X2 = EEV (endogenous explanatory variable)
Z = instrument
X1, X3 and X4 = control variables (all of them are continuous)
l_ = log()
id_sector_year = sector#year
id = municipality

Code:

xi: xtivreg2 Y l_X1 X3 X4 (l_X2 = l_Z) i.id_sector_year, i(id) fe first r

In this case, I obtain an Effective F of 1797.96, against the tau=5% critical value of 37.418 (that is, well above it).

But if I run (the only change with respect to the former is the log transformation of X4)

Code:

xi: xtivreg2 Y l_X1 X3 l_X4 (l_X2 = l_Z) i.id_sector_year, i(id) fe first r

The Effective F is now 93.83 (again, well above the critical value).

I have also tested the main regression, that is

Code:

* First-stage
reghdfe X2 Z X1 X3 X4, absorb(id id_sector_year) res
predict double v2hat, r

* Second-stage
ppmlhdfe Y v2hat X2 X1 X3 X4, absorb(id id_sector_year) vce(cluster id)

And the results are very different: the estimates for X2 are around 8.96 in the log case and 1.48 in the linear case. For comparison, the PPML FE estimate for X2 (without IV) is around 0.88. That implies a "bias" of about 10 times and 1.7 times, respectively. A huge difference.

Thank you very much !
Tags: control function, fixed effects, log transformation, PPML, weak instrument
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#2

18 Mar 2023, 12:34

I do not think there is anything surprising here. log X is not X, and if you use log X instead of X as a regressor in the first stage or in the second stage, you will get different results.

Regarding the second issue, PPML is a nonlinear estimator, no? Generally, plugging predicted values into nonlinear estimators is not a correct estimation strategy. You speak of a "control function", but what you show only plugs in predicted first-stage values. So I guess the last line of code you show is not a consistent estimator of what you want to estimate.
Mauricio Carvalho

Join Date: May 2018

Posts: 22
#3

18 Mar 2023, 13:15

Thank you very much, Prof Kolev

Yes, they are different things, but should the difference be that large?

Regarding the control function: I am aware of that problem, and I believe I am using the residual from the first stage, right? According to the -reghdfe- help file, the option "res" means "save regression residuals". Correct?

Thank you again!
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#4

19 Mar 2023, 02:47

These are effects of non-linearities, which arise when you use some strongly non-linear function instead of the original variable. There is no way to know in advance whether using non-linear functions will have a huge or a small effect. You should not worry about this.
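One possible way to see where the change comes from, sketched with the variable names from post #1: the first-stage strength depends on the variation left in the instrument after the controls and fixed effects are partialled out, so one can compare how much of l_Z each control set absorbs. This diagnostic is an illustration under that assumption and was not run in this thread; it requires the community-contributed `reghdfe` package.

```stata
* Sketch only: compares how much within variation in the instrument
* is explained by the controls under each specification of X4.

* X4 in levels
reghdfe l_Z l_X1 X3 X4, absorb(id id_sector_year)
display "within R2, X4 in levels: " e(r2_within)

* X4 in logs
reghdfe l_Z l_X1 X3 l_X4, absorb(id id_sector_year)
display "within R2, X4 in logs:   " e(r2_within)
```

A noticeably higher within R-squared in the log case would indicate that l_X4 absorbs more of the instrument's variation than X4 does, leaving less to identify the first stage. The same comparison can be repeated with l_X2 as the dependent variable.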

Yes, you are right, you have used the residual, which is the control function approach. I got tripped up because you call it "hat" and you abbreviated the residual prediction to "r". The gap between 0.88 and 1.48 is not a huge difference and can be expected as a result of the instrumentation.