Standard error intercept, Frisch-Waugh theorem

Stefano Lombardi

Join Date: Aug 2019

Posts: 17
#1

Standard error intercept, Frisch-Waugh theorem

02 Aug 2019, 16:53

Hi all,

I am applying the Frisch-Waugh-Theorem to partial out a set of fixed effects D and get the OLS estimates and standard errors of the remaining regressors X.
The idea is mainly to use the code to pre-process data, but I would like to understand exactly how to get correct OLS results before doing anything.

Below a MWE in Mata to clarify what I do.
For reference I report also a benchmark when using -areg-.

Code:

cls clear all sysuse auto, clear // Benchmark areg price gear length i.trunk, absorb(turn) // Absorb "manually" in MATA xi i.turn i.trunk gen uno = 1 mata // import y = st_data(., "price") X = st_data(., ("gear", "length", "_Itrunk_*", "uno")) D = st_data(., "_Iturn*") // demeaned X and y M_D = I(rows(y)) - D * qrinv(cross(D,D)) * D' // "residual maker" M_D = I - D(D'D)^(-1)D' y_dem = M_D * y X_dem = M_D * X // OLS using de-meaned variables and corresponding standard errors b1 = qrsolve(X_dem, y_dem) res2 = cross(y_dem - X_dem*b1, y_dem - X_dem*b1) MSE = res2/(rows(X_dem) - (cols(X_dem)+cols(D)-1)) XX_dem = qrinv(cross(X_dem, X_dem)) SE = sqrt(diagonal(XX_dem) * MSE) SE // Compute constant as c = y_bar - X_bar*b1 e = rows(b1) - 1 c = mean(y) - (mean(X[., 1..e]) * b1[1..e])' c end

I hope I haven't made trivial mistakes, in fact it seems that all estimates (including the separately computed constant) correspond to the ones reported by -areg-. However, I am not sure how to get the standard error of the constant without inverting the full (X, D)'(X, D) variance-covariance matrix (which would make very little sense, given the setting).

Thanks for any comments.
The question is cross-posted at https://stats.stackexchange.com/ques...-waugh-theorem
Tags: covariance matrix, least-squares, mata
FernandoRios

Join Date: Apr 2014

Posts: 2469
#2

02 Aug 2019, 18:56

Hi Stefano
This is indeed a nice exercise. Before you go any further, if you intention is to extend areg capabilities, be aware that there are other programs already available in stata that deal with multiple high dimensional fixed effects. The gold standard right now being reghdfe by Sergio Correira
That being said.
You are already on the right track to recover the standard errors of the constant. What I have done on my own version of fixed effects models is to use not the demeaned variables, but the demeaned recentered variables for the regression, see my own modification to your code

Code:

mata // import y = st_data(., "price") X = st_data(., ("gear", "length", "trunk", "uno")) D = st_data(., "ibn.turn*") // demeaned X and y // "residual maker" M_D = I - D(D'D)^(-1)D' M_D = I(rows(y)) - D * invsym(D'*D) * D' y_dem = M_D * y X_dem = M_D * X y_dem=y_dem:+(colsum(y):/rows(y)) X_dem=X_dem:+(colsum(X):/rows(X)) // OLS using de-meaned variables and corresponding standard errors b1 = invsym(X_dem'*X_dem)*X_dem'*y_dem res2 = cross(y_dem - X_dem*b1, y_dem - X_dem*b1) MSE = res2/(rows(X_dem) - (cols(X_dem)+cols(D)-1)) XX_dem = qrinv(cross(X_dem, X_dem)) SE = sqrt(diagonal(XX_dem) * MSE) SE end

The problem lies when 2 or more fixed effects sets are involved. For two, you need to count the "mobility groups". For more than 2, there is nothing concrete. I describe a small algorithm that can count the non estimable fixed effects, which works in practice on simulations, but have not shown it works in a mathematically rigorous matter.

HTH
Fernando
Comment
Stefano Lombardi

Join Date: Aug 2019

Posts: 17
#3

03 Aug 2019, 00:55

Hi Fernando,

thanks a lot for your reply. I am not planning to extend -areg-, but I would like to understand the underlying mechanics within a simple example, before moving to real data.

About the strategy you proposed, unfortunately it does not address the case when in X there is another set of fixed effects (as in my MWE).

Best wishes,
Stefano
Comment

FernandoRios

Join Date: Apr 2014
Posts: 2469

03 Aug 2019, 05:00

Hi Stefano,
It still works, but you need to count also the number of coefficients not estimated in your set of fixed effects. Look at the following code:

Code:

mata

//import
y=st_data(.,"price")
X=st_data(.,("gear length i.trunk uno"))
D=st_data(.,"ibn.turn")

//demeanedXandy
//"residualmaker"M_D=I-D(D'D)^(-1)D'
M_D=I(rows(y))-D*invsym(D'*D)*D'
y_dem=M_D*y
X_dem=M_D*X
y_dem=y_dem:+(colsum(y):/rows(y))
X_dem=X_dem:+(colsum(X):/rows(X))

//OLSusingde-meanedvariablesandcorrespondingstandarderrors
b1=invsym(X_dem'*X_dem)*X_dem'*y_dem
res2=cross(y_dem-X_dem*b1,y_dem-X_dem*b1)
MSE=res2/(rows(X_dem)-(cols(X_dem)-diag0cnt(invsym(X_dem'*X_dem))+cols(D)-1))
XX_dem=qrinv(cross(X_dem,X_dem))

SE=sqrt(diagonal(XX_dem)*MSE)
SE

end

the function "diag0cnt" counts the zero elements in the main diagonal, which means those cannot be estimated.
HTH
Fernando

Comment

Stefano Lombardi

Join Date: Aug 2019

Posts: 17
#5

03 Aug 2019, 07:00

Hi Fernando,

thanks for your answer, indeed your code returns almost the same result as areg (including the RMSE). The only part that is different is the standard error for the intercept (all other standard errors are identical).
With areg it's 11259.39, while with the MATA code it's 11258.3645.

I believe this is due to the fact that one needs to adjust the degrees of freedom, but I am getting a bit confused as I thought was already taken care of precisely when computing the MSE. Any ideas about this?

Many thanks,
Stefano
Comment

FernandoRios

Join Date: Apr 2014
Posts: 2469

03 Aug 2019, 08:55

Hi Stefano
YOu are absolutely right. I forgot the "double" adjustment that is required:

Code:

mata:

//import
y=st_data(.,"price")
X=st_data(.,("gear length i.trunk uno"))
D=st_data(.,"ibn.turn")

//demeanedXandy
//"residualmaker"M_D=I-D(D'D)^(-1)D'
M_D=I(rows(y))-D*invsym(D'*D)*D'
y_dem=M_D*y
X_dem=M_D*X
y_dem=y_dem:+(colsum(y):/rows(y))
X_dem=X_dem:+(colsum(X):/rows(X))

//OLSusingde-meanedvariablesandcorrespondingstandarderrors
b1=invsym(X_dem'*X_dem)*X_dem'*y_dem
err=y_dem-X_dem*b1
n=rows(y_dem)
k=rows(invsym(X_dem'*X_dem))-diag0cnt(invsym(X_dem'*X_dem))
//incorrect standard errors
VCV=sum(err:^2)/(n-k)*invsym(X_dem'*X_dem)
diagonal(VCV):^.5
// corrected 
VCV=VCV*(n-k)/(n-k-cols(D)+1)
diagonal(VCV):^.5
 

end

I modified the code to show what I have done in my own version of linear models with High order fixed effects.
HTH

Comment

Stefano Lombardi

Join Date: Aug 2019

Posts: 17
#7

03 Aug 2019, 14:23

Hi, of course, it makes total sense. Thanks!

Best wishes,
Stefano
Comment

Announcement