The occurrence of autocorrelation and heteroscedasticity in a panel data OLS regression

Gyman Tol

Join Date: Dec 2019

Posts: 4
#1

23 Dec 2019, 04:06

Hello everyone,

I am currently running an OLS panel data regression with industry and time fixed effects, and I am testing the assumptions of linear regression. My sample has 10,000 data points covering 2009-2018. When I run my tests, however, normality is rejected by the Jarque-Bera test, which is strange because when I plot my residuals, the distribution looks almost perfectly normal (Normality plot .pdf). Furthermore, when I test for heteroscedasticity and autocorrelation, both tests give me rather high chi-squared and F statistics (Tests for autocorrelation and heteroscedasticity.pdf). According to several forums and articles, I could solve this by running my regressions with cluster(FirmID) at the end, and so I did. But when I run my regressions in this manner, I can no longer test for autocorrelation and heteroscedasticity. So how can I be sure that clustering the standard errors on the firm IDs resolves the problem of heteroscedasticity and autocorrelation?

Yours sincerely,

Gyman van der Tol.
Attached Files

Tests for autocorrelation and heteroscedasticity.pdf (30.9 KB, 1 view)

Normality plot .pdf (56.1 KB, 1 view)
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

03 Jan 2020, 15:05

Clustering by panel gives you standard error estimates that are robust to heteroscedasticity and to serial correlation within panels. With such a large sample, even trivial deviations from normality will be statistically significant. This is why folks are cautious about interpreting such tests when the sample is very large.
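
The point about large samples can be illustrated with a minimal sketch. It is written in Python rather than Stata, and it uses the textbook Jarque-Bera formula, JB = n/6 * (S^2 + (K - 3)^2 / 4), with made-up skewness and kurtosis values; none of the numbers come from the original poster's data. The residual shape is held fixed while only the sample size changes.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    stat = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under normality the statistic is asymptotically chi-squared with 2 df,
    # whose upper-tail probability has the closed form exp(-stat / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical, nearly normal residuals: skewness 0.05, kurtosis 3.1.
for n in (100, 1_000, 10_000):
    stat, p = jarque_bera(n, 0.05, 3.1)
    print(f"n={n:>6}  JB={stat:7.3f}  p={p:.4f}")
```

For a fixed shape the statistic grows in proportion to n, so in this example the same mild departure from normality is not rejected at the 5% level with 100 or 1,000 observations but is rejected with 10,000.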

By the way, folks often won't open attached files - concerns about viruses.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17701
#3

04 Jan 2020, 02:55

Gyman:
as an aside to Phil's helpful reply:
- did you use -xtreg- or pooled OLS? (please see the FAQ on how to post more effectively, by including what you typed and what Stata gave you back via CODE delimiters. Thanks);
- normality is a (weak) requirement for the distribution of the epsilon error only. As Phil wisely highlighted, with a large sample it is common for a minimal departure from normality to reject the null. You should not be concerned about that, and should inspect the residual distribution visually rather than analytically. If the distribution looks weird in terms of variance, just impose non-default standard errors (i.e., cluster-robust);
- however, if you impose cluster-robust standard errors, there is no point in testing for heteroskedasticity and autocorrelation again, as the results of the tests will be exactly the same. Indeed, cluster-robust standard errors correct the estimated dispersion of the coefficients, not the residuals;
- reluctance to open attachments coming from unknown sources, due to the risk of importing active content, obviously applies to Phil, myself and most Statalisters (see the FAQ again about attachments).
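
The remark that cluster-robust standard errors change the coefficient's standard error and leave the residuals untouched can be sketched on a toy example. This is an illustration in Python, not Stata output: the data, the firm labels and the function name `ols_with_cluster_se` are all made up, the model has a single regressor and no intercept, and the finite-sample adjustment that Stata applies with vce(cluster) is deliberately left out.

```python
import math
from collections import defaultdict


def ols_with_cluster_se(x, y, cluster):
    """No-intercept OLS of y on x with conventional and cluster-robust SEs."""
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    resid = [yi - beta * xi for xi, yi in zip(x, y)]

    # Conventional variance: s^2 / sum(x^2), which assumes i.i.d. errors.
    n, k = len(x), 1
    s2 = sum(e * e for e in resid) / (n - k)
    se_conventional = math.sqrt(s2 / sxx)

    # Cluster-robust sandwich: sum x*e within each cluster, then square.
    score = defaultdict(float)
    for xi, ei, g in zip(x, resid, cluster):
        score[g] += xi * ei
    meat = sum(s * s for s in score.values())
    se_cluster = math.sqrt(meat / sxx ** 2)  # no finite-sample correction

    return beta, resid, se_conventional, se_cluster


# Made-up panel: two firms, two periods each, residuals correlated within firm.
x = [1.0, 2.0, 1.0, 2.0]
y = [2.0, 3.0, 0.0, 1.0]
firm = ["A", "A", "B", "B"]

beta, resid, se_ols, se_cl = ols_with_cluster_se(x, y, firm)
print("coefficient:", beta)
print("residuals:  ", resid)
print("conventional SE:", round(se_ols, 4), " cluster-robust SE:", round(se_cl, 4))
```

Both standard errors are computed from the same coefficient and the same residuals. Any test that is built from those residuals therefore returns the same result whether or not the standard errors are clustered.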

Kind regards,
Carlo
(Stata 19.0)
Gyman Tol

Join Date: Dec 2019

Posts: 4
#4

06 Jan 2020, 07:23

Thank you very much for your responses! I have sorted out my regression specifications by running a GLS regression with vce(robust), and since normality tests are so sensitive in large samples, I will check normality visually.