How to manually calculate standard errors for instrumental variables?

Tarjei W. Havneraas

Join Date: Nov 2016
Posts: 136

How to manually calculate standard errors for instrumental variables?

15 Jun 2020, 02:41

Hi

I am working on statistical inference with instrumental variables (IV) following Wooldridge (2016) Introductory Econometrics, Ch. 15. I am using the Card data set (like the book), with wages as outcome (y), education as a endogenous continuous treatment (x) and distance to college as a binary IV (z).

I want to calculate the standard errors manually, and preferably additionally in matrix form using Mata. So far, I am able to calculate coefficients but I can't seem to obtain the correct standard errors and would be happy for input on this.

I obtain the point estimate for βiv with the Wald-estimator:

βiv = E[y|z=1] - E[y|z=1] / E[x|z=1] - E[x|z=1]

Code:

. use http://pped.org/card.dta, clear // Load Card data set

. keep nearc4 educ lwage id 

. rename nearc4 z

. rename educ x

. rename lwage y

. bysort z: sum y x

-----------------------------------------------------------------------------------------------------------------
-> z = 0

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
           y |        957    6.155494    .4328417    4.60517   7.474772
           x |        957    12.69801    2.791523          1         18

-----------------------------------------------------------------------------------------------------------------
-> z = 1

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
           y |      2,053    6.311401    .4402214    4.60517   7.784889
           x |      2,053    13.52703    2.580455          2         18


. di (6.311401-6.155494)/(13.52703-12.69801)
.18806181

The point estimates can be cross checked by -ivregress-

Code:

. ivregress 2sls y (x=z), nohe
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .1880626   .0262826     7.16   0.000     .1365496    .2395756
       _cons |   3.767472   .3487458    10.80   0.000     3.083943    4.451001
------------------------------------------------------------------------------

I now want to proceed by calculating the standard errors. Wooldridge (2016, p. 466) writes that standard errors for βiv is obtained by using the square root of the estimated asymptotic variance, where the latter is obtained by

Var(βiv) = sigma^2/(SSTx*R^2x,z)

First, the total sum of squares for x (SSTx), is obtained by

Code:

. egen x_bar = mean(x)

. gen SSTx = (x-x_bar)^2

. quiet sum SSTx

. di r(sum)
21562.08

Second, R^2x,z is the squared population correlation between x and z, obtained from the regression output

Code:

. reg x z, nohe
------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |    .829019   .1036988     7.99   0.000     .6256912    1.032347
       _cons |   12.69801   .0856416   148.27   0.000     12.53009    12.86594
------------------------------------------------------------------------------

. di .829^2
.687241

Finally, sigma^2 is the error variance. Wooldridge says to use the IV residuals û in calculating the error variance

sigma^2 = 1/(n-2)*sum(û^2)

Code:

. quiet reg x z

. predict x_hat
(option xb assumed; fitted values)

. quiet reg y x_hat, nohe

. predict iv_resid
(option xb assumed; fitted values)

. quiet sum iv_resid

. di r(sum)
18848.115

. di (18848.114)^2
3.553e+08

. gen sigma_squared = 3.553e+08

. tabstat sigma_squared, format(%20.2f)

    variable |      mean
-------------+----------
sigma_squa~d |        355300000.00
------------------------

. di (1/(3010-2))*355300000
118118.35

Thus, when finally substituting my values into the formula for Var(βiv) and standard error I get

Code:

. di 118118.35/(21562.08*.687241)
7.971089

. * sigma 
. di sqrt(7.971089)
2.8233117

. * se(βiv)
. di 2.8233117/sqrt(21562.08)
.01922709

My main question is: Does someone see where I have miscalculated and can anyone provide the correct calculation?

I have also worked out IV in Mata, although I am stuck at variance-estimation here as well:

Code:

. mata
: 
: y=st_data(.,"y")

: 
: X=st_data(.,"x")

: 
: Z=st_data(.,"z")

: 
: X = X, J(rows(X),1,1) // Add constant

: 
: Z = Z, J(rows(Z),1,1) // Add constant

: 
: b_iv = luinv(Z'*X)*Z'*y

: 
: b_iv
                 1
    +---------------+
  1 |  .1880626042  |
  2 |  3.767472015  |
    +---------------+

: end

(I have posted at stats.stackexchange as well)

Tags: IV standard error Mata

FernandoRios

Join Date: Apr 2014

Posts: 2479
#2

15 Jun 2020, 06:27

Couple of clues here
1. errors or residuals from the 2sls is: Res=y-x*B_IV. In your code, you were not predicting residuals, but the predicted value x*BIV
2. ivregress as you use it does not adjust for degrees of freedom for the estimation of standard errors. If you want that to be done you need to include the option, "small"

HTH
Comment

Tarjei W. Havneraas

Join Date: Nov 2016
Posts: 136

15 Jun 2020, 09:41

Thank you! Glad you discovered the error in the predict command. sigma^2 now becomes 0:

Code:

. reg x z, nohe 
------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           z |    .829019   .1036988     7.99   0.000     .6256912    1.032347
       _cons |   12.69801   .0856416   148.27   0.000     12.53009    12.86594
------------------------------------------------------------------------------

. predict x_hat
(option xb assumed; fitted values)

. quiet reg y x_hat, nohe

. predict iv_resid, residuals

. quiet sum iv_resid

. di r(sum)
-8.027e-06

I have seen Jeff Wooldridge frequently reply to questions on this forum, so I am tagging in case I can get a reply from him. I will keep at it until I find out where the calculation is wrong.

Comment

FernandoRios

Join Date: Apr 2014

Posts: 2479
#4

15 Jun 2020, 09:52

Hi Tarjei
ok, so you are one step closer
your regression ". quiet reg y x_hat,"
will get the right betas, but residuals will need to be calculated differently

The right residuals: res=y-x*b_iv
the residuals you are creating: res= y-x_hat*b_iv
Comment
Tarjei W. Havneraas

Join Date: Nov 2016

Posts: 136
#5

18 Jun 2020, 07:20

Thanks for your reply. I have tried several ways to create the correct residuals, among them

Code:

. gen xh=. (3,010 missing values generated) . gen biv=. (3,010 missing values generated) . mata xh=z*invsym(z'*z)*z'*x . mata biv=invsym(xh'*xh)*xh'*y . mata biv 1 +---------------+ 1 | .1880626042 | 2 | 3.767472015 | +---------------+ . mata st_store(.,"biv",biv) st_store(): 3200 conformability error <istmt>: - function returned error

intended to use for res=y-x*b_iv. Do you have any suggestions?

I do suspect that R^2x,z may be off too. I will add that I have gotten a helpful reply on how to solve this using Mata at stackexchange, but it would be interesting to additionally solve this using the step-by-step way outlined by Wooldridge.
Comment

Announcement

How to manually calculate standard errors for instrumental variables?

Comment

Comment

Comment

Comment