
  • About Spurious Regressions

    Hello Stata community,

    How can I detect spurious regressions when using panel models in Stata?


  • #2
    Here's how. Suppose I want to predict homicides with ice cream sales. I have the weekly homicide rate as my outcome, a weekly vector of sales in millions for each city (let's say) and unit and time indicators. I uncover a significant relationship, with a coefficient of 10. Does this mean that as sales increase by a million dollars, the average change in the homicide rate rises by 10 points? Is ice cream causing an increase in killings?



    • #3
      You might not need to. If your panel has a large cross section, then the time series properties are not important.



      • #4
        If your panel has a large cross section, then the time series properties are not important
        I don't understand. Is this a general rule? I mean take the often cited example I gave above. Let's say we have 52 weeks of data and 100 cities. A regression of the kind I describe would be unacceptable, right, even if I had 500 cities?



        • #5
          Originally posted by Jared Greathouse
          Here's how. Suppose I want to predict homicides with ice cream sales. I have the weekly homicide rate as my outcome, a weekly vector of sales in millions for each city (let's say) and unit and time indicators. I uncover a significant relationship, with a coefficient of 10. Does this mean that as sales increase by a million dollars, the average change in the homicide rate rises by 10 points? Is ice cream causing an increase in killings?
          Yes, this is a good example for explaining what spurious regression means, but I want to know how it is detected statistically in Stata.

          For example, in other software, the Durbin-Watson statistic is compared with R-squared. If the Durbin-Watson statistic is less than R-squared, the regression may be spurious.



          • #6
            I want to know how this is detected statistically in Stata
            Not possible. As the analyst, you must adjust for all predictors you can that you think are theoretically relevant that would also affect your outcome. But, you can't adjust for everything, so after you've adjusted for everything that you think might be relevant, ultimately, there ain't no way to test for this statistically.

            In fact, there's always the possibility of omitted predictors. More precisely, there's always some omitted predictors! The issue is, does this omission make a practical difference.
            Last edited by Jared Greathouse; 08 Jan 2023, 12:07.



            • #7
              Originally posted by Jared Greathouse
              I don't understand. Is this a general rule? I mean take the often cited example I gave above. Let's say we have 52 weeks of data and 100 cities. A regression of the kind I describe would be unacceptable, right, even if I had 500 cities?
              The example you give is one of spurious correlation, that is, correlation driven by some unobserved or unmeasured factor missing from your model. Spurious correlation can arise anywhere, and it generally disappears once you include the omitted factor in your model.

              "Spurious regression" is a term coined by Granger, C. W., & Newbold, P. (1974). Spurious regressions in econometrics. Journal of Econometrics, 2(2), 111-120,
              and is a term which has a strict technical meaning in econometrics.

              "Spurious regression" refers to the spurious correlation that you get when you regress one nonstationary variable on another nonstationary variable.

              You can also check out Kolev, G. I. (2011). The "spurious regression problem" in the classical regression model framework. Economics Bulletin, 31(1), 925-937
              for a short reading on the matter, and for pointing out that "spurious regression" is intimately related to the estimation method. E.g., it occurs if you use OLS, and it does not occur (in certain conditions) if you use GLS.
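
              A minimal sketch of the nonstationarity point on simulated panel data may help. Nothing here comes from the data discussed in this thread: the variable names (id, week, x, y), the panel dimensions, and the seed are all made up for illustration. It builds independent random walks for each unit, runs a fixed-effects regression in levels, applies the Im-Pesaran-Shin panel unit-root test with xtunitroot, and then reruns the regression in first differences. Run it as a do-file.

              ```stata
              * Simulated panel: 20 units, 52 periods (hypothetical dimensions)
              clear
              set seed 2023
              set obs 20
              generate id = _n
              expand 52
              bysort id: generate week = _n
              xtset id week

              * x and y are independent random walks within each unit
              generate ex = rnormal()
              generate ey = rnormal()
              bysort id (week): generate x = sum(ex)
              bysort id (week): generate y = sum(ey)

              * Levels regression with unit fixed effects
              xtreg y x, fe

              * Im-Pesaran-Shin test; H0: all panels contain a unit root
              xtunitroot ips x
              xtunitroot ips y

              * Same regression in first differences
              xtreg D.y D.x, fe
              ```

              Since x and y are unrelated by construction, any apparent relationship in the levels regression is spurious in the Granger-Newbold sense. If the unit-root tests fail to reject, comparing the levels and first-difference results is one way to see how much of the levels fit depends on the nonstationarity.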




              • #8
                Originally posted by Jared Greathouse
                Not possible. As the analyst, you must adjust for all predictors you can that you think are theoretically relevant that would also affect your outcome. But, you can't adjust for everything, so after you've adjusted for everything that you think might be relevant, ultimately, there ain't no way to test for this statistically.

                In fact, there's always the possibility of omitted predictors. More precisely, there's always some omitted predictors! The issue is, does this omission make a practical difference.
                Thank you for your valuable contribution.



                • #9
                  I'll admit, I'm not familiar with the literature on spurious regressions. I just want to point out that you can use a post estimation command to estimate a Durbin Watson statistic. After running a regression, use the following command:

                  Code:
                  estat dwatson
                  Then I think you can just look at the output and check to see if the Durbin Watson statistic is less than the model R-squared.
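
                  A hedged sketch of that comparison follows. As far as I know, estat dwatson is a time-series postestimation command for regress, so it expects the data to be tsset as a single series rather than a full panel, and it leaves the statistic in r(dw), while regress leaves R-squared in e(r2). The data below are simulated, and the names t, x, y, dw_stat, and r2_stat are placeholders, not anything from this thread. Run it as a do-file.

                  ```stata
                  * Simulated single time series (placeholder names and seed)
                  clear
                  set seed 12345
                  set obs 200
                  generate t = _n
                  tsset t

                  * Two independent random walks
                  generate x = sum(rnormal())
                  generate y = sum(rnormal())

                  regress y x
                  estat dwatson

                  * Keep both statistics so they can be compared directly
                  scalar dw_stat = r(dw)
                  scalar r2_stat = e(r2)
                  display "Durbin-Watson = " dw_stat "   R-squared = " r2_stat

                  if dw_stat < r2_stat {
                      display "DW < R-squared: possible spurious regression"
                  }
                  else {
                      display "DW >= R-squared: the rule of thumb raises no flag"
                  }
                  ```

                  This is only the rule of thumb mentioned in #5, not a formal test, so a flag here is a prompt to check stationarity rather than a verdict.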
