Piecewise Linear Instrumental Variable Estimation

Faye Gao

Join Date: Nov 2014

Posts: 3
#1

06 Aug 2019, 09:23

Does anyone know a way to estimate a piecewise linear regression with instrumental variable(s)?

I have the model below:

y = a + b*Xabovezero + c*Xbelowzero + control variables + error term.

where Xabovezero = X if X>0, and zero otherwise, and Xbelowzero=X if X<0, and zero otherwise.

In case X is endogenous, I want to use an instrumental variable K. I wonder how to do this properly in Stata. I found a paper discussing it ( https://www.dbmi.pitt.edu/sites/defa...s/Scheines.pdf ) but have not found any Stata ado-files.

Any help is appreciated!
Tags: regression
FernandoRios

Join Date: Apr 2014

Posts: 2460
#2

06 Aug 2019, 09:45

Hi Faye
Two points on your problem. First, why restrict yourself to piecewise linear? You can be more flexible by using, for example, a partially linear model. https://www.sciencedirect.com/scienc...65176514001608

Second, in my own research I have found that the control function approach (in which the residuals from the first stage are included in the model) also works well.
Fernando
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2149
#3

06 Aug 2019, 10:28

Faye: Since you know that the threshold you want is zero, the method in the paper you linked is harder than it needs to be. Is X a (roughly) continuous variable? If it is, I have some suggestions -- one of which is an implementation of Fernando's excellent suggestion to use a control function approach.
Comment
Faye Gao

Join Date: Nov 2014

Posts: 3
#4

06 Aug 2019, 11:00

Thanks for your valuable input, Fernando and Jeff. X is a continuous variable (sort of: integers ranging from -10 to 10). I could regress X on K (the instrument) and K*K (to allow for nonlinearity between K and X) to get the residual (R1), then include it in the original model, i.e.
y = a + b*Xabovezero + c*Xbelowzero + control variables + d*R1 + error term. Please let me know if this model is inappropriate.

I am also curious about how to implement partially linear models in Stata. The paper that Fernando referred to is difficult for me to follow at this stage.

Faye

Last edited by Faye Gao; 06 Aug 2019, 11:14.
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2149
#5

07 Aug 2019, 06:51

The problem with the CF approach in your case is that linearity of the first stage is suspect because X takes only integer values. But you might use it as an approximation. In that case, I'd include more functions of R1, including R1^2, and even interact R1 with a dummy indicating that X is above zero. Once you go the CF route, you can make it pretty flexible.
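
A minimal Stata sketch of this flexible CF setup is given below. It runs on simulated data so that it is self-contained. The variable names (y, x, k, c1, c2), the coefficients, and the data-generating process are all assumptions made for illustration, and x is simulated as continuous for simplicity. The first stage includes the controls along with K and K*K.

```stata
* Simulated stand-in data; every name and coefficient here is hypothetical
clear
set seed 12345
set obs 5000
generate double c1 = rnormal()
generate double c2 = rnormal()
generate double k  = rnormal()
generate double v  = rnormal()                  // first-stage error
generate double x  = 0.8*k + 0.2*k^2 + 0.3*c1 - 0.2*c2 + v
generate double y  = 1 + 0.5*cond(x > 0, x, 0) - 0.5*cond(x < 0, x, 0) ///
                       + 0.4*c1 + 0.1*c2 + 0.7*v + rnormal()

* Piecewise terms; the if qualifier keeps missing x from being coded as zero
generate double x_above = cond(x > 0, x, 0) if !missing(x)
generate double x_below = cond(x < 0, x, 0) if !missing(x)
generate byte   above   = (x > 0) if !missing(x)

* First stage: instrument, its square, and the controls
regress x k c.k#c.k c1 c2
predict double r1, residuals

* Second stage: r1 by itself plus more flexible functions of r1
generate double r1_sq    = r1^2
generate double r1_above = r1*above
regress y x_above x_below c1 c2 r1 r1_sq r1_above, vce(robust)

* Joint significance of the control function terms
test r1 r1_sq r1_above
```

The standard errors reported by the second regress treat r1 as data rather than as an estimate, so it is common to bootstrap both stages together before relying on them.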

A more robust approach (in the sense of consistency) is to estimate, say, a binomial regression separately for X >= 0 and X < 0 and obtain the fitted values. The binomial model need not be correct if you then use the fitted values as instruments -- not regressors.
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2149
#6

07 Aug 2019, 06:53

By the way, whether you use the CF approach or generate the IVs as I suggested, the control variables must be included along with K in the first stage (whether it's linear or binomial).
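
One possible reading of the fitted-values-as-instruments suggestion is sketched below in Stata. It assumes that each piece of X (the part above zero, and the absolute value of the part below zero) is modeled as a binomial count with 10 trials, which matches the -10 to 10 range Faye describes. That modeling choice, the variable names (y, x, k, c1, c2), and the simulated data are assumptions for illustration only.

```stata
* Simulated stand-in data; every name and coefficient here is hypothetical
clear
set seed 12345
set obs 5000
generate double c1 = rnormal()
generate double c2 = rnormal()
generate double k  = rnormal()
generate double v  = rnormal()
* Integer-valued x restricted to [-10, 10]
generate double x  = max(-10, min(10, round(3*(0.8*k + 0.3*c1 - 0.2*c2 + v))))
generate double x_above = cond(x > 0, x, 0)
generate double x_below = cond(x < 0, x, 0)
generate double y  = 1 + 0.5*x_above - 0.5*x_below + 0.4*c1 + 0.1*c2 ///
                       + 0.7*v + rnormal()

* Binomial models for each piece, with the controls included along with k
generate double x_below_abs = -x_below          // counts from 0 to 10
glm x_above k c1 c2, family(binomial 10) link(logit)
predict double xa_hat, mu
glm x_below_abs k c1 c2, family(binomial 10) link(logit)
predict double xb_hat, mu

* Fitted values enter as instruments, not as regressors
ivregress 2sls y c1 c2 (x_above x_below = xa_hat xb_hat), vce(robust)
```

Because the fitted values are used only as instruments, the usual 2SLS standard errors apply and the binomial models serve just to generate instruments that are nonlinear functions of K and the controls.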
Comment

FernandoRios

Join Date: Apr 2014

Posts: 2460
#7

07 Aug 2019, 07:14

Hi Jeff Wooldridge,
If I may ask a follow-up question:
for the example Faye proposed, my first instinct was to use a CF approach along the following lines:

Code:

y = dependent variable
x1, x2 = exogenous controls
x3 = endogenous regressor
   x3a = x3 if x3>0 and zero otherwise
   x3b = x3 if x3<0 and zero otherwise
z = instrument

Econometric model:
y=a0+a1*x1+a2*x2+b1*x3a+b2*x3b+e

CF implementation. First stage:
x3=g0+g1*x1+g2*x2+g3*z+u

Model estimation:
y=a0+a1*x1+a2*x2+b1*x3a+b2*x3b+d1*u+e
However, as you indicated, it is common to interact the residuals with other terms,
so I thought about the following alternatives:
1) y=a0+a1*x1+a2*x2+b1*x3a+b2*x3b+d1*u*x3+e
2) y=a0+a1*x1+a2*x2+b1*x3a+b2*x3b+d1*u*x3a+d2*u*x3b+e
3) y=a0+a1*x1+a2*x2+b1*x3a+b2*x3b+d1*u*(x3>0)+d2*u*(x3<0)+e

In a simple simulation, however, Option 2 does not produce consistent estimates.

So my question is: while CF can be flexible, is the risk of "misspecifying" the control function part a general concern? Or is there any specific guidance on what to do or not to do in these cases?
Thank you
Fernando

Comment

Jeff Wooldridge

Join Date: Apr 2014

Posts: 2149
#8

07 Aug 2019, 08:23

Fernando:

Your CF must be such that x3 is a function of the exogenous variables and the CF. That's true in (1) and (3), but not (2). It's obvious for (1). For (3), if you add u*(x3 > 0) and u*(x3 < 0) you get u. But there is no way to write x3 as a function of those interactions in (2). The bottom line is, you should always include u by itself because you're assuming x3 is a linear function of u. After that, you can put in flexible functions.

BTW, in Faye's application, she has to decide where to put the X = 0 because x3 is discrete. I'm guessing in your simulation x3 is continuous and so P(x3 = 0) = 0.
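
The point about always including u by itself can be explored with a small simulation. The Stata sketch below uses Fernando's notation with a continuous x3. The sample size, the coefficients, and the assumption that the structural error is linear in the first-stage error are all choices made for illustration, not taken from Fernando's own simulation.

```stata
* Hypothetical data-generating process with true b1 = 1 and b2 = -1
clear
set seed 2019
set obs 20000
generate double x1 = rnormal()
generate double x2 = rnormal()
generate double z  = rnormal()
generate double u0 = rnormal()                  // true first-stage error
generate double x3 = 0.5*x1 - 0.5*x2 + z + u0   // continuous, so P(x3 = 0) = 0
generate double x3a = cond(x3 > 0, x3, 0)
generate double x3b = cond(x3 < 0, x3, 0)
* Endogeneity: the structural error contains 0.8*u0
generate double y = 1 + x1 + x2 + x3a - x3b + 0.8*u0 + rnormal()

* First stage and residual
regress x3 x1 x2 z
predict double u, residuals

* Interaction terms used in options (2) and (3)
generate double u_x3a = u*x3a
generate double u_x3b = u*x3b
generate double u_pos = u*(x3 > 0)
generate double u_neg = u*(x3 < 0)

* Option (2): u cannot be recovered from these two interactions
regress y x1 x2 x3a x3b u_x3a u_x3b

* Option (3): the two interactions sum to u when x3 is never exactly zero
regress y x1 x2 x3a x3b u_pos u_neg

* u by itself
regress y x1 x2 x3a x3b u
```

Comparing the coefficients on x3a and x3b across the three regressions with the values used to generate y shows which specifications recover them. With a discrete x3, as in Faye's application, the observations with x3 = 0 would have to be assigned to one of the two indicators for the interactions in option (3) to add up to u.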
Comment
FernandoRios

Join Date: Apr 2014

Posts: 2460
#9

07 Aug 2019, 08:50

Thank you!
That was insightful.
Also, you are correct about the simulation: x3 is continuous.
Best regards
Fernando
Comment
Faye Gao

Join Date: Nov 2014

Posts: 3
#10

02 Oct 2019, 09:54

Thank you both, Jeff and Fernando! I have read more on the control function approach and have followed your advice. I am wondering whether my interpretation is correct. Below, I borrow Fernando's notation.
Before using the CF, I get b1>0 and b2<0 from the original model:

y=a0+a1*x1+a2*x2+b1*x3a+b2*x3b+e.

After using the CF, I get b1>0, b2=0, and d1<0 from estimating the second-stage model:
y=a0+a1*x1+a2*x2+b1*x3a+b2*x3b+d1*u+e.

In my mind, the results suggest that the estimate prior to the CF (b2<0) is driven by the unexplained part of X (whose coefficient is d1). But they do not contradict the inference from the original model that Y has a negative relation to the overall movement in x3b (including both the exogenous and the unexplained parts). Can you please let me know if any of this is wrong?

Thanks!
Faye

Last edited by Faye Gao; 02 Oct 2019, 10:02.