areg and xtreg,fe with cluster option: which one is better?

Ines Simac

Join Date: Apr 2014

Posts: 10
#1

26 May 2014, 09:01

Dear all,

I am running an FE regression with year and firm fixed effects and tried both:

-xtreg, fe vce(cluster ID)
-areg, absorb(ID) vce(cluster ID)

The two commands give different t-values, which as I understand it comes from the different degrees of freedom they use.
Which one is more appropriate to use?

Thanks in advance,

Ines
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2169
#2

26 May 2014, 10:37

Ines: I just tried an example on a standard balanced panel data set and was surprised that areg and xtreg give clustered standard errors that are notably different -- just as you report. I know that there are cases with nonnested clusters where this occurs, but I thought they would coincide on a standard panel data set. I know the option "dfadj" used with xtreg makes the standard errors coincide with those from areg, but I'm not sure why it's necessary for standard panel data applications (or cases where each unit belongs to a unique cluster). I've always thought the xtreg clustered standard errors are fine, so I admit to being a little puzzled.

I tried bootstrapping with 1,000 replications and the bootstrap standard errors are much closer to the xtreg standard errors. Assuming you have "large N, small T" you might want to bootstrap and see what happens. What remains unclear to me is why areg clustered standard errors seem conservative. I suspect someone will soon post the answer.
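
A minimal sketch of the bootstrap check described above, for anyone who wants to try it. The nlswork example data, the regressors, the 200 replications and the seed are all assumptions chosen for illustration; they are not the data or settings used in this thread. With only the panel variable set in xtset, vce(bootstrap) in xtreg resamples whole panels.

```stata
* Sketch only: nlswork stands in for a "large N, small T" firm panel (assumption).
webuse nlswork, clear
xtset idcode

* Analytic cluster-robust standard errors from both commands
xtreg ln_wage tenure age i.year, fe vce(cluster idcode)
estimates store fe_cluster
areg ln_wage tenure age i.year, absorb(idcode) vce(cluster idcode)
estimates store areg_cluster

* Panel bootstrap: whole idcode panels are resampled
* (reps and seed are arbitrary illustrative choices)
xtreg ln_wage tenure age i.year, fe vce(bootstrap, reps(200) seed(12345))
estimates store fe_boot

* Coefficients and standard errors side by side
estimates table fe_cluster areg_cluster fe_boot, keep(tenure age) se
```

If the bootstrap standard errors sit closer to one set of analytic standard errors than the other, that is informal evidence about which degrees-of-freedom correction suits the design.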
Sophie Beams

Join Date: Jul 2014

Posts: 1
#3

29 Jul 2014, 14:13

I'd recommend reading the Statalist response by Ryan Kessler (who actually cites Prof. Wooldridge): http://www.stata.com/statalist/archi.../msg00596.html
Alfonso Sánchez-Peñalver

Join Date: Mar 2014

Posts: 432
#4

29 Jul 2014, 18:17

This is interesting. Following the discussion mentioned by Sophie, I have prepared a little example

Code:

clear all
set more off
sysuse auto
keep mpg price rep78
gen time = _n

* demeaning variables for later regressions
foreach v in mpg price {
    egen m`v' = mean(`v'), by(rep78)
    gen dm`v' = `v' - m`v'
    drop m`v'
}

* generating the cluster dummies for later regressions
quietly tabulate rep78, generate(cid)

* setting group and time dimensions for later estimations
xtset rep78 time

* Estimations with clustered errors
areg mpg price, absorb(rep78) vce(cluster rep78)
xtreg mpg price, fe vce(cluster rep78)
xtreg mpg price, fe vce(cluster rep78) dfadj
reg dmmpg dmprice, nocons vce(cluster rep78)
reg mpg price cid*, nocons vce(cluster rep78)

* Estimations with nonclustered errors
areg mpg price, absorb(rep78)
xtreg mpg price, fe
xtreg mpg price, fe dfadj
reg dmmpg dmprice, nocons
reg mpg price cid*, nocons

It seems that the degrees of freedom are not adjusted for the absorbed fixed effects when using xtreg, fe with clustered errors, but they are when using xtreg, fe with nonclustered errors. Notice how the two xtreg, fe estimations with nonclustered errors produce the same results, i.e. those that areg produces, so adding the option dfadj makes no difference. This is not the case with clustered errors, as has been pointed out.
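
To make the degrees-of-freedom point concrete, here is a sketch of the finite-sample factor applied to a cluster-robust variance, as I understand it. The notation is mine and not from the posts above: M is the number of clusters, N the number of observations, K the number of parameters the command counts, k the number of regressors, and G the number of absorbed groups. Exactly what each command counts in K (for instance, whether the constant is included) is an assumption that should be checked against the manuals.

```latex
\hat{V}_{\text{cluster}}
  = \frac{M}{M-1}\,\frac{N-1}{N-K}\,
    (X'X)^{-1}\Big(\sum_{g=1}^{M} X_g'\hat{u}_g\hat{u}_g' X_g\Big)(X'X)^{-1}

% Assumed counts: areg includes the absorbed effects, K_{areg} \approx k + G;
% xtreg, fe without dfadj does not, K_{xtreg} \approx k.
\frac{\mathrm{se}_{\text{areg}}}{\mathrm{se}_{\text{xtreg}}}
  \approx \sqrt{\frac{N-k}{N-k-G}}
  \approx \sqrt{\frac{T}{T-1}}
  \quad\text{for a balanced panel with } N = GT \text{ and } k \ll N
```

Under these assumptions the gap is largest when T is small, which would be consistent with the areg clustered standard errors looking conservative on a "large N, small T" panel.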

I have included the dummy variable regression and demeaned regressions for the sole purpose of comparison. Notice how, when using clustered standard errors, the standard errors of xtreg, fe (without dfadj) do not match those of the demeaned regression without the dummies. I believe that is because of the inclusion of the constant by xtreg, fe. So it really seems that xtreg, fe is messing up the degrees of freedom when using clustered errors.

Last edited by Alfonso Sánchez-Peñalver; 29 Jul 2014, 18:31.

Alfonso Sanchez-Penalver
