
  • Which lag-order should be chosen for the ADF?

    Hello everyone,

    I am currently writing my bachelor thesis on the price formation of cryptocurrencies, for which I am conducting a time series analysis (either a VECM or an ARDL model). However, I am facing some problems regarding the stationarity of some of my variables. Generally, it shouldn't be a problem if some variables are stationary while others are integrated of order one, as an ARDL model could easily be applied. As I am comparing the price formation of Bitcoin and Ethereum, I assume the same model should be applied to both cryptocurrencies for the sake of comparability.

    One of the variables is "views on Wikipedia", a proxy for public recognition or attractiveness. For the "Bitcoin views on Wikipedia", the ADF test shows different results depending on the lag order (stationary for a low lag order; non-stationary for a high lag order). Moreover, the results for Ethereum suggest that the time series is non-stationary.

    Here are my questions:

    1. Which result of the ADF test for Bitcoin is more reliable and should be considered? Should the number of lagged differences be chosen according to the AIC?
    2. If the model for Bitcoin indeed includes non-stationary variables, would I apply an ARDL model to the Bitcoin data and a VECM to the Ethereum data (the variables are cointegrated)? Is it possible to compare the results of two different tests?

    The data are log-transformed and the observations are daily. I attached the graph of the "Bitcoin views on Wikipedia" and two test results for different lag orders.

    Any help is much appreciated!

    Thanks in advance!

  • #2
    You can use the varsoc command to select the optimal lag order, e.g.
    Code:
    . webuse lutkepohl2
    (Quarterly SA West German macro data, Bil DM, from Lutkepohl 1993 Table E.1)
    
    . varsoc inc, maxlag(8) exog(qtr)
    
       Selection-order criteria
       Sample:  1962q1 - 1982q4                     Number of obs      =        84
      +---------------------------------------------------------------------------+
      |lag |    LL      LR      df    p      FPE       AIC      HQIC      SBIC    |
      |----+----------------------------------------------------------------------|
      |  0 | -513.757                      12607.7   12.2799   12.3032   12.3378  |
      |  1 |  -344.95  337.61    1  0.000  231.985   8.28452   8.31942*  8.37134* |
      |  2 | -344.254  1.3909    1  0.238  233.684   8.29177    8.3383   8.40753  |
      |  3 | -344.226  .05784    1  0.810  239.166   8.31489   8.37306   8.45959  |
      |  4 | -341.495  5.4616*   1  0.019  229.534*  8.27368*  8.34348   8.44731  |
      |  5 | -341.495  .00033    1  0.985  235.097   8.29749   8.37892   8.50006  |
      |  6 | -340.616  1.7578    1  0.185  235.822   8.30037   8.39344   8.53188  |
      |  7 | -340.511  .20914    1  0.647  240.963   8.32169   8.42639   8.58214  |
      |  8 | -340.502  .01897    1  0.890  246.789   8.34528   8.46161   8.63466  |
      +---------------------------------------------------------------------------+
       Endogenous:  inc
        Exogenous:  qtr  _cons
    
    . dfuller inc, lags(3) regress trend
    
    Augmented Dickey-Fuller test for unit root         Number of obs   =        88
    
                                   ---------- Interpolated Dickey-Fuller ---------
                      Test         1% Critical       5% Critical      10% Critical
                   Statistic           Value             Value             Value
    ------------------------------------------------------------------------------
     Z(t)             -2.206            -4.066            -3.462            -3.157
    ------------------------------------------------------------------------------
    MacKinnon approximate p-value for Z(t) = 0.4863
    
    ------------------------------------------------------------------------------
    D.inc        |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             inc |
             L1. |  -.0275985   .0125091    -2.21   0.030    -.0524832   -.0027139
             LD. |   .1139497   .1044827     1.09   0.279    -.0938996    .3217991
            L2D. |    .024097   .1059889     0.23   0.821    -.1867486    .2349427
            L3D. |   .2882646   .1169132     2.47   0.016     .0556869    .5208422
          _trend |    .868694    .344646     2.52   0.014     .1830833    1.554305
           _cons |   10.55553   4.006389     2.63   0.010     2.585544    18.52551
    ------------------------------------------------------------------------------
    Notice that you can add a time trend to varsoc by adding the time identifier as an exogenous variable, here exog(qtr). Furthermore, the optimal lag length supplied to the dfuller command is the one obtained from the varsoc command less one, because first differencing removes the highest lag.
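
    The mapping from the varsoc result to the dfuller option can be sketched as follows. This is only an illustration: the local macro names p and k are made up here, and the AIC choice of 4 is typed in by hand from the table above rather than extracted programmatically.

    ```stata
    * Load the same example data as above
    webuse lutkepohl2, clear

    * Lag order in levels suggested by the AIC in the varsoc table (entered by hand)
    local p = 4

    * dfuller counts lagged first differences, so pass one lag fewer
    local k = `p' - 1

    * ADF test with trend; the regression output should list LD., L2D. and L3D.
    dfuller inc, lags(`k') trend regress
    ```

    Following the HQIC or SBIC instead (lag 1 in the table) would correspond to lags(0), that is, a Dickey-Fuller regression without lagged differences.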

    The choice between the ARDL model and a VECM should not primarily be made based on whether the variables are stationary or not. The important difference between the two is that the ARDL model implicitly assumes that there exists at most one cointegrating relationship between the dependent variable and the (weakly) exogenous regressors, while there can be more than one such cointegrating relationship in the VECM. For more on ardl, also see the following Statalist topic: ARDL in Stata
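
    One way to check how many cointegrating relationships the data support is Johansen's trace test via vecrank. The sketch below is an assumption-laden illustration: the three variables from the lutkepohl2 example data and the lag order of 4 are chosen only for demonstration and are not taken from the original question.

    ```stata
    * Load the example data used earlier in this thread
    webuse lutkepohl2, clear

    * Johansen trace test for the cointegration rank;
    * lags(4) is the lag order of the underlying VAR in levels (assumed here)
    vecrank ln_inv ln_inc ln_consump, lags(4)
    ```

    If the test points to more than one cointegrating relationship, a single-equation ARDL model would not capture all of them, which speaks for the VECM.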
    https://www.kripfganz.de/stata/



    • #3
      Dear Sebastian Kripfganz,
      I am following the example you provided here, yet I fail to understand one point. According to the -varsoc- results, the AIC points to an optimal lag of 4, whereas the HQIC and SBIC point to 1. In the end, you used a lag length of 3 in the -dfuller- command. What is the reasoning behind this? Wouldn't 4 lags be the more conservative choice for -dfuller-?
      Thanks,
      Anat



      • #4
        For dfuller you have to specify the number of lags for the process in first differences. This is one lag less than the lag order in levels, 3 instead of 4 (based on the AIC).
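
        The reason can be seen from the standard reparameterization of an AR(p) process into the ADF regression. The sketch below omits the deterministic trend for brevity, and the symbols are notation introduced here, not taken from the Stata output.

        ```latex
        \begin{align*}
        y_t &= \mu + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t \\
        \Delta y_t &= \mu + \rho \, y_{t-1} + \sum_{j=1}^{p-1} \gamma_j \, \Delta y_{t-j} + \varepsilon_t,
        \qquad \rho = \sum_{i=1}^{p} \phi_i - 1, \quad \gamma_j = -\sum_{i=j+1}^{p} \phi_i
        \end{align*}
        ```

        An AR(p) model in levels therefore has only p - 1 lagged differences in the ADF form. With p = 4 this gives the three terms LD., L2D. and L3D. shown in the dfuller output above.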



        • #5
          Thanks Sebastian Kripfganz, that is clear now. Does this also hold for the var command? I mean, if -varsoc- points to 4 lags (according to the AIC), should 3 lags be specified?



          • #6
            No, for the var command (and also for the vec command) you should specify the lags as proposed by the varsoc command, that is 4.
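
            As a rough sketch of how the lag order enters the two commands: the variables below come from the lutkepohl2 example data and are chosen only for illustration, and the lag order of 4 is simply carried over from the univariate varsoc table above. In practice, varsoc would be rerun on the full system.

            ```stata
            * Load the example data used earlier in this thread
            webuse lutkepohl2, clear

            * var: lags() takes a list of lags, so 1/4 includes lags 1 through 4
            var ln_inv ln_inc ln_consump, lags(1/4)

            * vec: lags(#) is the lag order of the underlying VAR in levels
            vec ln_inv ln_inc ln_consump, lags(4)
            ```

            Note that lags(4) in var would include only the fourth lag, whereas vec expects a single number and internally uses one fewer lagged difference.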



            • #7
              Originally posted by Sebastian Kripfganz
              For dfuller you have to specify the number of lags for the process in first differences. This is one lag less than the lag order in levels, 3 instead of 4 (based on the AIC).
              Hi Sebastian. I know this was posted a while back, but do you have a reference that backs this comment up?
