Robust or Clustered Standard Errors

Barbara Hama

Join Date: Feb 2019

Posts: 10
#1

28 Mar 2019, 03:57

Dear all,

I am currently examining the impact of annual average sunset time on sleep duration of children in 4 developing countries.
I have a panel data set over 3 years (2009, 2013 and 2016).
The variable "sleep" denotes the hours per day allocated to sleep by child i in country c in studysite s at time t.
The variable "annual average sunset" only varies at studysite level, so it denotes the average annual sunset time in studysite s in country c.

I ran the following regressions:

*OLS
eststo m1: regress sleep annual_avg_sunset if in_model_3==1
estadd local fe No
estadd local fe_ No

*OLS with control variables
eststo m2: regress sleep annual_avg_sunset age wi hhsize typesite elec i.year, vce(robust)
estadd local fe Yes
estadd local fe_ No

*FE(country&year with robust SE)
eststo m3: regress sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)
estadd local fe Yes
estadd local fe_ Yes

*FE(country&year SE clustered at the studysite year level)
eststo m4: regress sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(cluster studysite_year)
estadd local fe Yes
estadd local fe_ Yes

Model (3) uses robust standard errors, Model (4) uses clustered standard errors at the studysite_year level. In Model (4) my coefficient on annual average sunset time becomes insignificant and I get a large standard error.

Now I am wondering whether I should cluster the standard errors and, if so, at what level. Does it make sense to cluster at the studysite_year level in my example?

Thank you,

Barbara
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

28 Mar 2019, 10:41

Barbara:
just one step back: if you have panel data with a continuous regressand, why use -regress- as your first choice when -xtreg- is available?

Kind regards,
Carlo
(Stata 19.0)
Barbara Hama

Join Date: Feb 2019

Posts: 10
#3

29 Mar 2019, 03:41

Hi Carlo,

I did use xtreg when I wanted to include child fixed effects. For that I declared the data with xtset child_ID year. (I have panel data over 3 years with N = 19134 observations.)
When I wanted to include only age or country fixed effects, I used the dummy variable method, since I cannot xtset country year. I tried, but Stata gave me the following error message: repeated time values within panel

I am very new to Stata, so I thought pooled OLS with the dummy variable method might be the right thing to do if I want to include those fixed effects in my regressions.

Best,
Barbara
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#4

29 Mar 2019, 06:19

Barbara:
provided that you do not have genuine duplicates (i.e., mistaken data entry) and you do not plan to use time-series features, such as lags and leads, you can -xtset- your data with -panelid- only.
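
A minimal sketch of this suggestion, assuming the thread's dataset is in memory with the variable names used elsewhere in the thread (childid, country, year). It is an illustration rather than Carlo's own code. It checks for duplicates first and then declares a panel identifier only, which sidesteps the "repeated time values within panel" error because no time variable is set:

```stata
* Assumption: the thread's dataset is loaded; childid identifies the child
* and country is numeric (it is used as i.country in the thread).

* Each child should appear once per year; surplus copies reported here
* would point to genuine duplicates (e.g., data-entry mistakes).
duplicates report childid year

* Declare the panel identifier only: no time variable is set.
xtset country

* -fe- now absorbs the country effects; the year dummies stay in the model.
xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year, fe
```

With no time variable declared, time-series operators such as L. and F. are not available.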

Kind regards,
Carlo
(Stata 19.0)
Barbara Hama

Join Date: Feb 2019

Posts: 10
#5

29 Mar 2019, 06:31

Thank you for your reply, Carlo.
I will do that.

Can you tell me if the regressions (1) -(4) are ok and if I should use robust standard errors or cluster them?

Best,
Barbara
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#6

29 Mar 2019, 08:13

Barbara:
I'm not clear whether you measured the same sample of children during a three year timespan or not.

Kind regards,
Carlo
(Stata 19.0)
Barbara Hama

Join Date: Feb 2019

Posts: 10
#7

29 Mar 2019, 09:08

Yes, I did.
I have three observations for every child, one for 2009, one for 2012 and one for 2016.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#8

29 Mar 2019, 09:54

Barbara:
thanks for providing further details.
Some comments about your query:
- as you have N>T panel dataset, -xtreg- should be your first choice;
- if you detect heteroskedasticity and/or autocorrelation in your data, I would run the following code:

Code:

xtset children year
xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)

-robust- imposes cluster-robust standard errors on -panelid- (which is usually the way to go).

Kind regards,
Carlo
(Stata 19.0)
Barbara Hama

Join Date: Feb 2019

Posts: 10
#9

29 Mar 2019, 10:12

Thank you so much for your help, Carlo!

Actually, I do not really understand the difference between these two codes:

(1) regress sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)

(2) xtset childid year
xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)

Could it be that with code (1), Stata doesn't treat the dataset as panel data? But both regressions include year- and country-fixed effects, right?

Best,
Barbara
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#10

29 Mar 2019, 10:27

Barbara:
your first code actually runs a pooled OLS, which is not the first choice when you have a panel dataset.
The code I suggested runs a linear panel data regression with random effects and cluster-robust standard errors.
I would recommend that you take a look at the -xtreg- entry in the Stata .pdf manual and at this valuable textbook for Stata users dealing with econometrics: https://www.stata.com/bookstore/micr...metrics-stata/

Kind regards,
Carlo
(Stata 19.0)
Barbara Hama

Join Date: Feb 2019

Posts: 10
#11

29 Mar 2019, 10:50

May I ask why you chose a regression with random effects?

I just performed both, a fixed effects and a random effects regression.

xtset childid year
xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, re
xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, fe

According to the Hausman test, I should go with fixed effects.
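
For readers following along, the comparison described here is usually run roughly as sketched below. The sketch assumes the thread's dataset is in memory; the stored-estimate names fe_model and re_model are arbitrary labels chosen for illustration and do not come from the original post:

```stata
* Assumption: the thread's dataset is loaded; childid identifies the child.
xtset childid year

* Fixed effects with default standard errors
* (-hausman- does not run after vce(robust) or vce(cluster ...)).
xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, fe
estimates store fe_model

* Random effects, same specification
xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, re
estimates store re_model

* Consistent estimator first, efficient estimator second
hausman fe_model re_model
```

A small p-value rejects the hypothesis that the differences in coefficients are not systematic, which is the usual reading of "go with fixed effects".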
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#12

29 Mar 2019, 13:17

Barbara:
just to give you an example.
Go -fe- if -hausman- points you to it.
I see that you used default standard errors in your code.
Hence, I assume that you did not detect heteroskedasticity and/or autocorrelation in your panel dataset.
However, you cannot run -hausman- with default standard errors and then invoke non-default standard errors thereafter. If you want to compare -fe- vs -re- with non-default standard errors, you can rely on the community-contributed command -xtoverid-. Being a bit old-fashioned, it does not allow -fvvarlist- notation; a feasible trick is to prefix your code with -xi:-.
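
A rough sketch of the -xtoverid- route, assuming the thread's dataset is in memory and that childid is the panel identifier. -xtoverid- is community-contributed and has to be installed from SSC first; depending on the version, it may also ask for other SSC packages such as -ivreg2-, which is an assumption worth checking against its help file:

```stata
* One-time installation of the community-contributed command
ssc install xtoverid

* Assumption: the thread's dataset is loaded; childid identifies the child.
xtset childid year

* -xi:- expands i.year and i.country into plain dummies,
* because -xtoverid- does not accept factor-variable notation.
xi: xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, re vce(cluster childid)

* Test of overidentifying restrictions (fixed vs random effects)
* that remains valid with cluster-robust standard errors.
xtoverid
```

As with -hausman-, rejecting the null points towards -fe-.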

Kind regards,
Carlo
(Stata 19.0)
lal mohan kumar

Join Date: May 2019

Posts: 265
#13

06 Aug 2020, 00:14

Dear Carlo
Sincere apologies for reopening this thread and addressing the question to you directly, but I guess you may be able to clear up my doubts, which are based on some of your comments. In this thread, at #8, you mentioned that,

if you detect heteroskedasticity and/or autocorrelation in your data, I would run the following code:

And the commands you gave is

Code:

xtset children year
xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(robust)

Based on what I have learned from this forum, this code is equivalent to

Code:

xtreg sleep annual_avg_sunset age wi hhsize typesite elec i.year i.country, vce(cluster children)

where children is the panelvar.
Now my doubt is: since the vce() option gives robust standard errors, it accounts for heteroskedasticity too, right?
In the link, https://www.statalist.org/forums/for...th-vce-cluster
@Carlo #4, you have mentioned

under -regress-, -vce(robust)- accounts for heteroskedasticity in residual distribution, whereas -vce(cluster)- accounts for residual autocorrelation.

Of course, one is pooled OLS and the other is a panel regression, but still, doesn't the vce() option account for both heteroskedasticity and autocorrelation?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#14

06 Aug 2020, 01:10

Lal:
1) -regress-: if you detect both heteroskedasticity and autocorrelation, you should go -vce(cluster panelid)-;
2) -xtreg-: if you detect heteroskedasticity and/or autocorrelation, you can go -robust- or -vce(cluster panelid)-.
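
The two points above can be checked on a dataset that ships with Stata. The sketch below uses the nlswork example data rather than the thread's dataset, so the variable names (idcode, ln_wage, age, tenure) and the scalar names are illustrative assumptions only:

```stata
* Example panel shipped with Stata (not the thread's data)
webuse nlswork, clear
xtset idcode year

* 1) Pooled OLS: clustering on the panel identifier has to be requested explicitly
regress ln_wage age tenure, vce(cluster idcode)

* 2) -xtreg-: vce(robust) and vce(cluster panelvar) give the same standard errors
xtreg ln_wage age tenure, re vce(robust)
scalar se_robust = _se[tenure]

xtreg ln_wage age tenure, re vce(cluster idcode)
scalar se_cluster = _se[tenure]

* Show both standard errors side by side
display se_robust "  " se_cluster
```

Under -regress-, by contrast, -vce(robust)- alone does not cluster on the panel identifier, which is why the two commands are treated separately.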

Kind regards,
Carlo
(Stata 19.0)
lal mohan kumar

Join Date: May 2019

Posts: 265
#15

06 Aug 2020, 01:19

Ok Carlo, thanks for your answer