Constrained Regression with Instrumental Variables

Christos Makridis

Join Date: Nov 2014

Posts: 157
#1

23 Nov 2014, 14:00

Hi everyone,

Is there a way to estimate a constrained regression with instrumental variables? In particular, suppose that I have data on the following variables: w, L, C, S, and the instrument Z. I want to run:

log(w) = (psi - 1)*log(L) + (1 - phi)*log(C) + (psi - phi)*log(S) + epsilon

where Z would instrument for log(S).

I looked extensively online and it seems that cnsreg does not support instrumental variables. (Even without instrumenting, I am not sure how cnsreg would handle the equation above.)

Any suggestions would be greatly appreciated -- thank you for your time!

Jorge Eduardo Perez Perez

Join Date: Mar 2014
Posts: 429

24 Nov 2014, 11:43

You can use either a control function approach or reg3.

Code:

clear
set seed 2568
* Generate some data
set obs 200
* Generate some exogenous regressors
gen l=rnormal()
gen c=rnormal()
* Generate endogenous regressors and instrument
gen u=rnormal()
gen v=0.5*rnormal()
gen z=rnormal()
gen s=z+u
* Generate y
glo psi=2
glo phi=4
gen y=($psi-1)*l+(1-$phi)*c+($psi-$phi)*s+u+v

* OLS biased
reg y l c s
* Constrained OLS is also biased
* _b[l]=psi-1
* _b[c]=1-phi
* So _b[s]=psi-phi = _b[l]+_b[c]
constraint 1 s= l + c
cnsreg y l c s, constraints(1)
* IV
ivregress 2sls y l c (s=z)
* IV with reg3
reg3 y l c s, inst(l c z) 2sls
* Constrained case
* Control function
reg s z l c
predict res, resid
cnsreg y l c s res, constraints(1)
* Need to adjust standard errors
* reg3
reg3 y l c s, inst(l c z) 2sls constraints(1)

Jorge Eduardo Pérez Pérez
www.jorgeperezperez.com
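
As a follow-up sketch to the code above: once the constraint is imposed, the structural parameters can be backed out from the estimated coefficients, since _b[l] = psi - 1 and _b[c] = 1 - phi. The snippet below regenerates similar simulated data (the draws are illustrative and will not match the post's output exactly) and uses nlcom to report psi and phi with delta-method standard errors. It assumes reg3 names the equation after the dependent variable y, and the standard errors are built on the non-robust VCE that reg3 reports.

```stata
clear
set seed 2568
set obs 200
* Exogenous regressors, instrument, and error components
gen l = rnormal()
gen c = rnormal()
gen u = rnormal()
gen v = 0.5*rnormal()
gen z = rnormal()
* s is endogenous because it shares u with the error term of y
gen s = z + u
* True values used in the simulation: psi = 2, phi = 4
gen y = (2 - 1)*l + (1 - 4)*c + (2 - 4)*s + u + v

* Impose _b[s] = _b[l] + _b[c], i.e. psi - phi = (psi - 1) + (1 - phi)
constraint 1 s = l + c
reg3 y l c s, inst(l c z) 2sls constraints(1)

* Recover psi = _b[l] + 1 and phi = 1 - _b[c]
nlcom (psi: _b[y:l] + 1) (phi: 1 - _b[y:c])
```

With the constraint in place, psi - phi is pinned down by the other two coefficients, so only psi and phi need to be reported.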


Christos Makridis

Join Date: Nov 2014

Posts: 157
#3

24 Nov 2014, 13:55

Great suggestion, Jorge, and thank you for your time -- I just implemented it and it works! Could you clarify how the standard errors should be adjusted? I am not sure about the theory of clustered standard errors in the context of 3SLS, but I imagine robust standard errors should be possible. I didn't find much online, though.

My code (in general terms) is now:

constraint define 1 L+ C = S
global stage1 "(first: w L C S controls)"
global stage2 "(second: S IV controls)"
reg3 $stage1 $stage2, endog(S) constr(1)
Jorge Eduardo Perez Perez

Join Date: Mar 2014

Posts: 429
#4

24 Nov 2014, 14:19

My comment on the standard error adjustment was about the control function approach, where you have to account for the first-stage residuals being a generated regressor. reg3 with the 2sls option (which is missing in your code) gives the same standard errors as ivregress 2sls with a small-sample adjustment:

Code:

ivregress 2sls y l c (s=z), small
reg3 y l c s, inst(l c z) 2sls

which are just the regular IV standard errors. These are non-robust though, and reg3 does not allow robust standard errors. A quick and dirty solution would be to bootstrap the whole thing:

Code:

bootstrap, reps(100): reg3 y l c s, inst(l c z) 2sls constraints(1)

Jorge Eduardo Pérez Pérez
www.jorgeperezperez.com
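
The generated-regressor issue described above for the control function approach can be handled in the same quick-and-dirty way, by bootstrapping both stages together so that the first stage is re-estimated in every replication. The following is a minimal sketch on simulated data. The program name cf_constrained and the variable name cf_res are made up for illustration, and whether the bootstrap is appropriate for a given application is a separate question.

```stata
clear
set seed 2568
set obs 200
gen l = rnormal()
gen c = rnormal()
gen u = rnormal()
gen v = 0.5*rnormal()
gen z = rnormal()
gen s = z + u
gen y = (2 - 1)*l + (1 - 4)*c + (2 - 4)*s + u + v

constraint 1 s = l + c

* Hypothetical wrapper: runs the first stage and the constrained
* second stage, so bootstrap resamples the whole procedure
capture program drop cf_constrained
program define cf_constrained, eclass
    capture drop cf_res
    quietly regress s z l c
    quietly predict double cf_res, residuals
    cnsreg y l c s cf_res, constraints(1)
end

* Bootstrap all coefficients left behind by cnsreg
bootstrap _b, reps(200) seed(2568): cf_constrained
```

The residual is stored under a fixed name rather than a temporary one so that the coefficient vector has the same column names in every replication.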
Christos Makridis

Join Date: Nov 2014

Posts: 157
#5

24 Nov 2014, 16:40

Thank you again for checking on the thread and clarifying! I misunderstood the full extent of your initial post, but I have now tried the suggestions. While I grasp the concept, I think I will stick with the reg3 code I wrote (based on your comments) plus the bootstrap suggestion. I will work more on understanding whether the bootstrap is appropriate here.
Christos Makridis

Join Date: Nov 2014

Posts: 157
#6

06 Apr 2015, 01:25

Here's another question on this broader topic: what's the difference in Stata between writing "inst()" and writing out the instruments "by hand"? By hand, I mean writing the equations explicitly in their "2SLS form". What does the 2sls option do compared with 3sls? A comment earlier suggested that it affects the computation of standard errors, but what does the 3sls option in Stata do differently?

Here's example code below. What I'm trying to do is instrument leisure, nondurables, electricity consumption, and the cubic in air quality in a regression of wages on those regressors, plus controls and fixed effects.

constraint define 2 -1*(lleisure + lcons_nondur) = laqi + laqi2 + laqi3
global stage1 "(first: lwage_hourly lleisure lcons_nondur lcons_elect laqi laqi2 laqi3 $X $stecon i.year i.industry i.county)"
global stage2 "(second: laqi $ivwindsemi1 $X $stecon i.year i.industry i.county)"
global stage3 "(third: laqi2 $ivwindsemi2 $X $stecon i.year i.industry i.county)"
global stage4 "(fourth: laqi3 $ivwindsemi3 $X $stecon i.year i.industry i.county)"
global stage5 "(fifth: lleisure $ivweather $X $stecon i.year i.industry i.county)"
global stage6 "(sixth: lcons_nondur $ivcons $X $stecon i.year i.industry i.county)"
global stage7 "(seventh: lcons_elect $ivcons $X $stecon i.year i.industry i.county)"
quietly reg3 $stage1 $stage2 $stage3 $stage4 $stage5 $stage6 $stage7 [weight=pweight_count], constr(2) 3sls

The only role that electricity consumption plays in stage7 is as a control since the $ivcons instruments are interactions of electricity and some other fixed effects -- so by including it as a control I am exploiting electricity consumption variation within the categories of the interacted dummy variables.

On a side note, if there's a more efficient way to write this, I'd love to know. These regressions take a while to run.
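
On the inst() question, a small simulated comparison may help. As I read the reg3 documentation, every dependent variable in the system is treated as endogenous and all remaining variables are used as instruments, so writing the first stage out as its own equation should imply the same instrument set as listing it in inst(). The 2sls option estimates equation by equation, while the default 3sls adds a GLS step that uses the estimated error covariance across equations. The data below are illustrative only.

```stata
clear
set seed 2568
set obs 200
gen l = rnormal()
gen c = rnormal()
gen u = rnormal()
gen z = rnormal()
gen s = z + u
gen y = l - 3*c - 2*s + u + 0.5*rnormal()

* (a) One equation, exogenous variables listed explicitly in inst()
reg3 (y l c s), inst(l c z) 2sls

* (b) First stage written out "by hand" as a second equation:
*     s is endogenous because it is a dependent variable,
*     and l, c, z serve as the instruments
reg3 (y l c s) (s z l c), 2sls

* (c) Same system with the default 3SLS estimator
reg3 (y l c s) (s z l c)
```

Comparing the y equation across (a) and (b) shows whether the two ways of specifying instruments agree, and comparing (b) with (c) shows what the extra 3SLS step changes.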
Christos Makridis

Join Date: Nov 2014

Posts: 157
#7

13 Aug 2015, 08:19

I've realized that reg3 with lots of fixed effects is quite slow. One of the more efficient ways seems to be the control function approach noted in an earlier post by Jorge. Since S, L, and C are all endogenous, does the control function approach go something like this?

reg S iv_S controls
predict res1, resid
reg L iv_L controls
predict res2, resid
reg C iv_C controls
predict res3, resid
reg w S L C res1 res2 res3 controls

Jorge Eduardo Perez Perez

Join Date: Mar 2014
Posts: 429

13 Aug 2015, 09:10

No, you have to regress each endogenous variable on the full set of instruments.

Code:

* Generate example data
clear
set obs 1000
set seed 98135136
* Common unobserved factor
gen u=rnormal()
* Instruments
gen z1=rnormal()
gen z2=rnormal()
* Endogenous variables
gen x1=z1+8*u+rnormal()
gen x2=z2+8*u+rnormal()
* Dependent variable
gen y= 3*x1 + 5*x2 + 5*u + rnormal()
*  End of data generation

* OLS is biased
reg y x1 x2
* IV is consistent
ivregress 2sls y (x1 x2 = z1 z2)
* Control function approach running separate first stages doesn't give IV results
reg x1 z1
predict e1, resid
reg x2 z2
predict e2, resid
reg y x1 x2 e1 e2

* Control function approach yields IV estimates
drop e1 e2
reg x1 z1 z2
predict e1, resid
reg x2 z1 z2
predict e2, resid
reg y x1 x2 e1 e2

Jorge Eduardo Pérez Pérez
www.jorgeperezperez.com
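
The same logic extends to exogenous controls: each first stage should include all instruments and all exogenous controls that appear in the outcome equation. Here is a sketch that adds one control to the example above. The variable w is hypothetical and was not part of the original simulation. It compares the control function coefficients with ivregress.

```stata
clear
set obs 1000
set seed 98135136
gen u = rnormal()
gen z1 = rnormal()
gen z2 = rnormal()
* Hypothetical exogenous control, added for illustration
gen w = rnormal()
gen x1 = z1 + 0.5*w + 8*u + rnormal()
gen x2 = z2 - 0.5*w + 8*u + rnormal()
gen y = 3*x1 + 5*x2 + 2*w + 5*u + rnormal()

* Benchmark: 2SLS with the control included as an exogenous regressor
ivregress 2sls y w (x1 x2 = z1 z2)
scalar b1_iv = _b[x1]
scalar b2_iv = _b[x2]

* First stages: each endogenous variable on ALL instruments and the control
foreach v of varlist x1 x2 {
    quietly regress `v' z1 z2 w
    predict double e_`v', residuals
}

* Second stage with both residuals; standard errors here are not adjusted
regress y x1 x2 w e_x1 e_x2
display "x1: CF minus IV = " _b[x1] - b1_iv
display "x2: CF minus IV = " _b[x2] - b2_iv
```

The displayed differences should be zero up to numerical precision, because the point estimates from the two procedures coincide when the first stages use the full instrument set.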


Christos Makridis

Join Date: Nov 2014

Posts: 157
#9

13 Aug 2015, 23:26

Thanks Jorge!