  • Multiple independent variables in time series regression

    Hi all,

    I have calculated variables which I would like to regress on macroeconomic variables like inflation and employment rate growth over a 30 year period.
    In the end I want to obtain residuals which indicate the part not explained by macroeconomic data.

    The model is:
    regress variable Y(t) on X1(t) X2(t) ... X6(t), where Y is the dependent and X1-6 are the independent variables in each period.
    (below the data for Y(t) = Market variance)

    I know that autocorrelation and stationarity are essential to check. However, I am confused because some variables tend to be stationary and some do not. The Durbin-Watson statistic further indicates autocorrelation, with a value of 2.73 for the non-adjusted model.
    • How do I need to conduct the regression - has anybody a step by step approach?
    • Which model to choose (OLS/VAR/ARIMAX)?
    • Are there any good links out there which handle a problem like mine?
    Thanks,

    Lukas

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float marketvar1 double(inf tbill) float(termprem indprodgrow ccongrow employgrow)
     32.93915  13.5492 13.06667 -1.60667  -3.571994   .7745178    -1.069687
    14.473107 10.33471 15.91083       -2  1.0235256  1.6412628    -.1598251
    27.171177 6.131427 12.27083   .73084  -5.450018  1.7165437   -1.9529936
       7.3375 3.212435 9.066667 2.038333   4.785403   5.421594     .3609022
     14.07939 4.300536   10.365  2.07333   9.775706  4.4819427     3.224122
      11.1487 3.545644   8.0475  2.57583  1.6197594   5.813297    1.2716854
     22.78145 1.898048 6.518333 1.164167  2.1965344   4.542871     .9870806
     42.14015 3.664563 6.860833 1.523334   5.689137  3.1009665    1.5199437
     8.090176 4.077741   7.7275 1.118334    5.29615  4.0147786    1.2802964
    10.663842 4.827003    9.085 -.586667   .8127782   3.432024    1.1847918
    27.305094 5.397956   8.1475    .4025   .8010056   4.115614    -.4186898
    20.868055 4.234964    5.835 2.023333 -1.9034253 -.04099826   -1.6955696
    4.6357465  3.02882 3.681667 3.328333  3.7240565   3.001073   -.22793497
     3.010679 2.951657 3.174167 2.699166   3.550947  2.7693124     .5677073
     8.754811 2.607442 4.629167 2.450833   5.903679   3.654736    1.1149478
    2.3005676  2.80542 5.916667  .663333   5.086609  3.3741674     .7404681
    10.210496 2.931204     5.39 1.048334   4.925306   3.675153     .4464747
    18.756668  2.33769 5.615833  .736667  8.4037895   3.138127     .8829276
     42.14015 1.552279 5.466667   -.2025   6.652367  4.5199494     .4644905
     27.13987 2.188027     5.33  .306667   5.092548   5.053048    .13894258
    35.580956 3.376857 6.455833 -.426666   4.080763  4.3813334     .2008356
     38.47124 2.826171 3.686667 1.330833  -3.648927  2.8988185    -1.294211
    33.120213 1.586032 1.725833    2.885  .50577986    2.51261    -1.649415
    11.092725 2.270095 1.150833 2.864167   1.331691   3.110792    -.9814775
      5.53891 2.677237 1.563333 2.710834    3.13841   3.682374 -.0012074694
     6.102785 3.392747 3.511667  .778333  4.0872436   5.268197     .4320125
     3.879092 3.225944 5.153333 -.361666   2.558301   2.380492     .6527455
     7.762117 2.852673 5.268333 -.639166  2.7572885   2.411894    -.2996788
      35.3852   3.8391    2.965  .701667  -4.768422  1.2100405   -1.2441537
    end
    Last edited by Lukas Meissner; 06 Jan 2019, 06:37.
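
    The regression described above can be sketched as follows. This is only a minimal sketch: the -dataex- listing has no time identifier, so the variable year is an assumption (29 consecutive annual observations starting in 1980), and res_marketvar1 is a made-up name for the residual series.

    ```stata
    * Assumption: rows are consecutive annual observations starting in 1980
    capture generate int year = 1979 + _n   // skipped if year already exists
    tsset year

    * Baseline OLS of market variance on the six macroeconomic variables
    regress marketvar1 inf tbill termprem indprodgrow ccongrow employgrow

    * Residuals: the part of marketvar1 not explained by the macro variables
    predict double res_marketvar1, residuals
    ```

    Whether plain OLS is adequate here depends on the stationarity and autocorrelation checks mentioned above.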

  • #2
    How do I need to conduct the regression - has anybody a step by step approach?
    I think this depends on your research question. For example, a VAR is well suited for examining dynamic relationships. You might want to look into cointegrating relationships amongst levels of the variables. If there's nothing there, then make all the variables stationary.

    Which model to choose (OLS/VAR/ARIMAX)?
    Same as above. What's the end goal? An ARDL model could be appropriate, so could a VAR. It depends what question you're trying to answer.

    Are there any good links out there which handle a problem like mine?
    I would look at the literature that's related to what you're trying to research. Stand on the shoulders of giants, don't reinvent the wheel. You might find this useful as a step by step approach to ARDL modeling.

    I am confused because some variables tend to be stationary and some do not. The Durbin-Watson statistic further indicates autocorrelation, with a value of 2.73 for the non-adjusted model.
    Some econometricians are very strict about all variables being stationary (I(0)). I would suggest removing any unit roots and then running your analysis. I'm not sure what a non-adjusted model is, but a decent range for a Durbin-Watson statistic is roughly 1.6-2.4. Of course, the closer to 2, the better. Note that a value above 2, such as your 2.73, points to negative rather than positive autocorrelation. I tend to see high values when a model is overfit or when the history is very short. You might be using too many lags in your model (assuming you're not solely using contemporaneous values).
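
    As a rough illustration of how the autocorrelation check could be run on the posted data, here is a minimal sketch. The variable year is an assumption (the listing has no time identifier), and the lag length of 1 in -newey- is chosen only for illustration.

    ```stata
    * Assumption: annual data starting in 1980; year is a hypothetical time index
    capture generate int year = 1979 + _n   // skipped if year already exists
    tsset year

    regress marketvar1 inf tbill termprem indprodgrow ccongrow employgrow
    estat dwatson              // Durbin-Watson d statistic
    estat bgodfrey, lags(1)    // Breusch-Godfrey LM test for first-order serial correlation

    * Same point estimates as OLS, but with Newey-West (HAC) standard errors
    newey marketvar1 inf tbill termprem indprodgrow ccongrow employgrow, lag(1)
    ```

    Since -newey- leaves the coefficients unchanged, the residuals are the same as from -regress-; only the standard errors differ.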


    • #3
      Hi Justin,

      I appreciate your efforts helping me. Let me start with what I am researching on:

      Research:
      I created an index for several countries (one index per country) consisting of 4 variables (e.g. market variance) by PCA. As the variables might be affected by macroeconomic conditions, I would like to orthogonalize them by regressing them on several macroeconomic series (e.g. inflation, industrial production growth, consumption growth). Finally, I would like to create a second index based on the regression residuals of these 4 variables, i.e. the part not explained by macroeconomic predictors in each period t. Neither the authors of my "role model" paper nor those of the previous paper detail how they conducted this step (Baker / Wurgler / Yuan (2012), Global, local, and contagious investor sentiment, p. 276).

      For each country I have 4 regressions for the period 1980-2008, in the case above it is:
      Market Variance (t) = b0 + b1 * Inflation + b2 * T-Bill + b3 * TermPremium + b4 * Industrial Prod. Growth + b5 * Consumption growth + b6 * Employment rate growth
      As I have 4 countries with 4 variables each this results in 16 models for which the residuals are needed.
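
      For one country, the orthogonalization and the second index could be sketched as below. Only marketvar1 and the six macro variables come from the posted data; proxy2, proxy3 and proxy4 are hypothetical placeholders for the other three index components, and the names res_* and index_orth are likewise made up for illustration.

      ```stata
      * Hypothetical names: proxy2-proxy4 stand in for the other three index components
      local macrovars inf tbill termprem indprodgrow ccongrow employgrow

      foreach y of varlist marketvar1 proxy2 proxy3 proxy4 {
          regress `y' `macrovars'
          predict double res_`y', residuals   // part of `y' not explained by the macro variables
      }

      * Second index: first principal component of the four residual series
      pca res_marketvar1 res_proxy2 res_proxy3 res_proxy4, components(1)
      predict double index_orth, score
      ```

      With four countries, the same loop could be repeated per country, which keeps the 16 regressions manageable.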

      My ultimate goal is a simple approach/model, given the number of models that need to be estimated.

      Cointegration/Regression

      From what I understand, cointegration only applies if all variables share the same order of integration, i.e. I(1). To test this, I use the command dfgls var, maxlag(8) to obtain the optimal number of lags, then dfuller var, lags(x) with either the drift or the trend option. I choose between drift and trend based on a plot of the series.

      Following these tests, inflation is stationary at lag 1, but industrial production and employment rate growth are non-stationary at lag 1 (lag 1 for the ADF test as suggested by the dfgls test, trend included).
      This indicates to me that not all variables are I(1), so cointegration does not apply. Based on this, I conclude that another model might be better.

      It might be that my approach is wrong or I conduct the Unit Root Test wrong as I am quite a newbie to this.
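
      The testing procedure described above could be run over all variables in one loop. This is a sketch under assumptions: year is a hypothetical annual time index (the posted data have none), and the single lag and the trend option in -dfuller- simply mirror the choices mentioned above rather than a recommendation.

      ```stata
      * Assumption: annual data starting in 1980; year is a hypothetical time index
      capture generate int year = 1979 + _n   // skipped if year already exists
      tsset year

      foreach v of varlist marketvar1 inf tbill termprem indprodgrow ccongrow employgrow {
          display _newline "=== `v' ==="
          dfgls `v', maxlag(8)          // DF-GLS test; also reports lag-selection criteria
          dfuller `v', lags(1) trend    // augmented Dickey-Fuller with one lag and a trend
      }
      ```

      In practice the lag length and the drift/trend choice would be set per variable from the dfgls output and the plots.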

      I am afraid that ARDL might be too complex given the number of models that need to be estimated?


      Thanks

      Lukas
      Last edited by Lukas Meissner; 07 Jan 2019, 02:44.


      • #4
        Edit to my earlier post:

        In Campbell (1996), "Understanding Risk and Return", a one-lag VAR model is estimated to study the unexpected part of the variable.
        So, I guess that I will do a VAR model and obtain the residuals from this model.

        As not all variables are stationary at level, my initial problem stays the same .. Do I need to difference the non-stationary variables and leave stationary variables as is - so that all variables are stationary?

        Thanks for helping,

        Lukas
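
        A one-lag VAR with residuals for the market variance equation could look like the sketch below. The time index year and the residual name u_marketvar1 are assumptions for illustration, and the variables enter in levels here, which leaves the stationarity question open.

        ```stata
        * Assumption: annual data starting in 1980; year is a hypothetical time index
        capture generate int year = 1979 + _n   // skipped if year already exists
        tsset year

        * VAR(1) in all seven variables
        var marketvar1 inf tbill termprem indprodgrow ccongrow employgrow, lags(1)

        * Residuals of the marketvar1 equation: the part not predicted by last period's values
        predict double u_marketvar1, residuals equation(marketvar1)
        ```

        Note that these residuals are unexplained by lagged values of all variables, which is not the same as being orthogonal to the contemporaneous macro variables in the original regression.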


        • #5
          As not all variables are stationary at level, my initial problem stays the same .. Do I need to difference the non-stationary variables and leave stationary variables as is - so that all variables are stationary?
          From what I've read, most time series econometricians difference non-stationary variables so that all variables are stationary. If you're going to run a VAR, I would also look into innovation accounting (impulse response functions and variance decomposition of forecast errors), and (unrelated to VARs) dynamic factor models (which I've seen used a number of times to construct indices of time series variables, for example here).
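
          A minimal sketch of the differencing step followed by innovation accounting, under stated assumptions: year is a hypothetical annual time index, the two differenced series are the ones reported as non-stationary in #3, and the names d_*, base and sentiment_irf are made up for illustration.

          ```stata
          * Assumption: annual data starting in 1980; year is a hypothetical time index
          capture generate int year = 1979 + _n   // skipped if year already exists
          tsset year

          * First-difference the series treated as non-stationary; leave the others as is
          generate double d_indprodgrow = D.indprodgrow
          generate double d_employgrow  = D.employgrow

          var marketvar1 inf tbill termprem d_indprodgrow ccongrow d_employgrow, lags(1)

          * Impulse responses and forecast-error variance decomposition
          irf create base, set(sentiment_irf, replace) step(8)
          irf table fevd, impulse(inf) response(marketvar1)
          irf graph oirf, impulse(inf) response(marketvar1)
          ```

          The orthogonalized responses depend on the ordering of the variables in -var-, so that ordering would need a justification.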
