Here's how. Suppose I want to predict homicides with ice cream sales. I have the weekly homicide rate as my outcome, a weekly vector of sales in millions for each city (let's say), and unit and time indicators. I uncover a significant relationship with a coefficient of 10. Does this mean that as sales increase by a million dollars, the homicide rate rises by 10 points on average? Is ice cream causing an increase in killings?
If your panel has a large cross section, then the time series properties are not important.
I don't understand. Is this a general rule? I mean, take the often-cited example I gave above. Let's say we have 52 weeks of data and 100 cities. A regression of the kind I describe would be unacceptable, right, even if I had 500 cities?
Yes, this is a good example to explain the meaning of spurious regressions, but I want to know how this is detected statistically in Stata.
For example, in other software the Durbin-Watson statistic is compared with the R-squared: if the Durbin-Watson statistic is less than the R-squared, the regression may be spurious.
I want to know how this is detected statistically in Stata
Not possible. As the analyst, you must adjust for every predictor you think is theoretically relevant and that would also affect your outcome. But you can't adjust for everything, so after you've adjusted for everything you think might be relevant, ultimately there is no way to test for this statistically.
In fact, there's always the possibility of omitted predictors. More precisely, there are always some omitted predictors! The issue is whether this omission makes a practical difference.
The example you give is of spurious correlation. That is, correlation that is driven by some other unobserved/unmeasured factor omitted from your model. Spurious correlation can happen always and everywhere, and it generally disappears once you include the omitted factor in your model, as the sketch below illustrates.
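A minimal sketch of that point (my own illustration, not from the thread; all variable names are invented): generate a common factor z, build x and y from it, and watch the apparent effect of x vanish once z enters the model.
Code:
* Hypothetical illustration: spurious correlation through an omitted factor
clear
set seed 2024
set obs 1000
gen z = rnormal()        // unobserved common cause (say, summer heat)
gen x = z + rnormal()    // "ice cream sales"
gen y = z + rnormal()    // "homicide rate"
regress y x              // x looks significant, but only through z
regress y x z            // once z is included, the x coefficient collapses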
"Spurious regression" is a term coined by Granger, C. W., & Newbold, P. (1974). Spurious regressions in econometrics. Journal of econometrics, 2(2), 111-120
and is a term which has a strict technical meaning in econometrics.
The meaning of "spurious regression" is the spurious correlation that you get when you regress one nonstationary variable on another nonstationary variable.
You can also check out Kolev, G. I. (2011). The "spurious regression problem" in the classical regression model framework. Economics Bulletin, 31(1), 925-937, for a short reading on the matter, and for the point that "spurious regression" is intimately related to the estimation method. E.g., it occurs if you use OLS, and it does not occur (under certain conditions) if you use GLS.
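To see the Granger-Newbold phenomenon concretely, here is a minimal sketch (my own, not from the posts above): regress one simulated random walk on another, independent one. The slope is typically "significant", the R-squared is sizable, and the Durbin-Watson statistic is far below 2; -dfuller- confirms that neither series is stationary.
Code:
* Hypothetical simulation of a spurious regression
clear
set seed 12345
set obs 200
gen t = _n
tsset t
gen x = sum(rnormal())    // random walk #1
gen y = sum(rnormal())    // random walk #2, independent of x
regress y x               // typically "significant" despite no true relation
estat dwatson             // Durbin-Watson statistic well below 2
dfuller x                 // augmented Dickey-Fuller: cannot reject a unit root
dfuller y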
I'll admit, I'm not familiar with the literature on spurious regressions. I just want to point out that you can use a postestimation command to compute the Durbin-Watson statistic. After running a regression on tsset time-series data, use the following command:
Code:
estat dwatson
Then I think you can just look at the output and check whether the Durbin-Watson statistic is less than the model R-squared.
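For completeness, here is a sketch of that whole check in one place (my own; y, x, and time are placeholder names, and the data must be -tsset- for -estat dwatson- to run):
Code:
* Hypothetical workflow: compare Durbin-Watson with R-squared
tsset time                 // declare the time variable first
regress y x
local r2 = e(r2)           // R-squared saved by -regress-
estat dwatson
local dw = r(dw)           // Durbin-Watson statistic saved by -estat dwatson-
display "R-squared = " `r2' "   DW = " `dw'
if `r2' > `dw' {
    display "R-squared exceeds DW: possible spurious regression"
}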