
  • Stationary residuals. Spurious regression?

    Greetings Stata Users.

    I'm trying to estimate the role of public spending in economic growth. So basically I'm using an inverse Wagner's Law to see how those two variables are related.

    I have checked both GDP (PIB) and public spending, and both of them have unit roots. They are integrated of order 2.
    This is consistent with the high R^2 value and the probably spurious regression shown below.

    Code:
    reg PIB_nominal_bm total_gasto_nom
    
          Source |       SS           df       MS      Number of obs   =        28
    -------------+----------------------------------   F(1, 26)        =  11070.25
           Model |  2.0580e+30         1  2.0580e+30   Prob > F        =    0.0000
        Residual |  4.8334e+27        26  1.8590e+26   R-squared       =    0.9977
    -------------+----------------------------------   Adj R-squared   =    0.9976
           Total |  2.0628e+30        27  7.6400e+28   Root MSE        =    1.4e+13
    
    ---------------------------------------------------------------------------------
     PIB_nominal_bm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
    total_gasto_nom |   5.04e+09   4.79e+07   105.22   0.000     4.94e+09    5.14e+09
              _cons |   3.52e+13   4.05e+12     8.68   0.000     2.69e+13    4.35e+13
    ---------------------------------------------------------------------------------
    The next commands were:

    Code:
    predict u, res
    dfuller u
    
    Dickey-Fuller test for unit root                   Number of obs   =        27
    
                                   ---------- Interpolated Dickey-Fuller ---------
                      Test         1% Critical       5% Critical      10% Critical
                   Statistic           Value             Value             Value
    ------------------------------------------------------------------------------
     Z(t)             -3.732            -3.736            -2.994            -2.628
    ------------------------------------------------------------------------------
    MacKinnon approximate p-value for Z(t) = 0.0037
    Maybe I'm reading the p-value wrong, but I suppose the null hypothesis is that the series has a unit root.


    So when I ran the stationarity test on the residuals of that regression, the dfuller test says they are stationary. So I'm a bit confused: does this mean that the relation is not spurious and that it's telling me the long-run relationship?


  • #2
    You might want to try an augmented Dickey-Fuller test instead by adding first-differenced lags of the error term with the lags() option of the dfuller command. You could obtain the optimal lag order by calling
    Code:
    varsoc u
    Notice that the varsoc command delivers the optimal lag order for the level error term. The lag order for the dfuller command should be this lag order minus 1.
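
    A minimal sketch of that workflow, assuming the data are already tsset and u holds the residuals from the levels regression. The value p = 2 is only an illustration, not a result from this thread; if varsoc selects 0 or 1 lags, the plain dfuller test without lags() is the corresponding choice.

    ```stata
    * Lag-order selection for the level of the residual series u
    * (maxlag(4) is varsoc's default, written out here for clarity)
    varsoc u, maxlag(4)

    * Suppose the criteria had selected p = 2 lags in levels (illustrative assumption).
    * The ADF regression then uses p - 1 = 1 lagged first difference:
    dfuller u, lags(1)

    * If the criteria select p = 0 or p = 1, no lagged differences are needed:
    dfuller u
    ```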
    https://twitter.com/Kripfganz



    • #3
      I've just run the command you gave me in order to choose the lag order for the dfuller test.

      Code:
      Selection-order criteria
         Sample:  1994 - 2017                         Number of obs      =        24
        +---------------------------------------------------------------------------+
        |lag |    LL      LR      df    p      FPE       AIC      HQIC      SBIC    |
        |----+----------------------------------------------------------------------|
        |  0 | -755.914                      1.4e+26*  63.0761*  63.0892*  63.1252* |
        |  1 |  -755.91  .00789    1  0.929  1.6e+26   63.1591   63.1852   63.2573  |
        |  2 | -755.901  .01772    1  0.894  1.7e+26   63.2417   63.2808    63.389  |
        |  3 | -755.898  .00475    1  0.945  1.9e+26   63.3249    63.377   63.5212  |
        |  4 | -755.833  .13018    1  0.718  2.0e+26   63.4028   63.4679   63.6482  |
        +---------------------------------------------------------------------------+
      And the dfuller test with lag order minus 1:
      Code:
      dfuller u, lags(3)
      
      Augmented Dickey-Fuller test for unit root         Number of obs   =        24
      
                                     ---------- Interpolated Dickey-Fuller ---------
                        Test         1% Critical       5% Critical      10% Critical
                     Statistic           Value             Value             Value
      ------------------------------------------------------------------------------
       Z(t)             -2.510            -3.750            -3.000            -2.630
      ------------------------------------------------------------------------------
      MacKinnon approximate p-value for Z(t) = 0.1131
      So according to that, the errors still have a unit root. How do I proceed?



      • #4
        Hi John,

        Did you ever get a response to this?



        • #5
          John: Why did you choose 3 lags for the dfuller command? Your varsoc command indicates that 0 lags are optimal, which is already quite weird because that would not make much sense if there is serial correlation in the tested variable, and a unit root is an extreme form of serial correlation.

          To answer the initial question: When the (augmented) Dickey-Fuller test indicates that the residuals do not have a unit root, you would conclude that you have estimated a long-run level relationship between the variables in your initial model. Otherwise, there is a risk of a spurious regression. But note that a high R^2 can also be the consequence of deterministically trending variables, which is quite likely when you have macroeconomic variables that are not detrended. I would recommend adding a linear time trend to your model.
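
          As a rough sketch of the time-trend suggestion: the time variable name year is hypothetical (the thread never shows it), and u_trend is just an illustrative name for the new residuals.

          ```stata
          * Assumption: the data are annual and the time variable is called year
          tsset year

          * Levels regression with a linear time trend added
          regress PIB_nominal_bm total_gasto_nom year

          * Residuals from the trend-augmented regression, then the unit-root test
          predict u_trend, residuals
          dfuller u_trend
          ```

          If the residuals still look nonstationary after the trend is included, the spurious-regression concern remains.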

          More generally, I recommend estimating an error correction model as a reparameterization of an autoregressive distributed lag model. The dynamic nature circumvents potential spurious-regression problems that you would have in static models, and you can still infer the level relationship from the dynamic specification.
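
          One possible way to set this up is the community-contributed ardl command. The sketch below assumes it has been installed from SSC and that the data are tsset; the maximum lag of 2 is an arbitrary choice for a sample of 28 observations, not a recommendation from this thread.

          ```stata
          * Install once (community-contributed command, not part of official Stata):
          * ssc install ardl

          * ARDL model with lag orders chosen by AIC (up to 2 lags per variable),
          * reported in error-correction form via the ec option
          ardl PIB_nominal_bm total_gasto_nom, maxlags(2) aic ec

          * Bounds test for the existence of a level relationship
          estat ectest
          ```

          The long-run coefficient on total_gasto_nom and the speed-of-adjustment coefficient appear in separate equations of the ardl output.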
          https://twitter.com/Kripfganz



          • #6
            I'm confused. Your selection-order criteria suggest no lags (lowest information criteria, marked with *). And then you say you ran a dfuller test with lag order minus 1, but your dfuller command uses 3 lags.

            Also, which country has GDP that is I(2)? I would suggest looking at real GDP, which should be I(1). Here's a good rule of thumb: If your R-squared is greater than your Durbin-Watson statistic, you have a spurious regression. If your regression is spurious, difference the variables until they are stationary and run your regression on those series. Here's a helpful blog post related to this issue.
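
            A quick way to check that rule of thumb and then rerun the regression on differenced series, sketched under the assumption that the data are tsset. Whether first or second differences are needed depends on the actual order of integration, so both lines are shown as alternatives.

            ```stata
            * Levels regression, then compare R-squared with the Durbin-Watson statistic
            regress PIB_nominal_bm total_gasto_nom
            display "R-squared = " e(r2)
            estat dwatson

            * If the series are I(1), regress first differences:
            regress D.PIB_nominal_bm D.total_gasto_nom

            * If they really are I(2), as the original post reports, use second differences:
            regress D2.PIB_nominal_bm D2.total_gasto_nom
            ```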

