  • Originally posted by Sebastian Kripfganz View Post
    The equations on page 12 are derived from the equation on page 5. With ec1, when the lag order of ln_inc is zero, its first lag is included in the error-correction form. However, in terms of the coefficients of the equation on page 5, this first lag has a coefficient equal to zero. Because this coefficient equals zero, we need the restriction for the coefficients on page 12, which I mentioned in my previous post. Put differently, for your model there is a total of 8 coefficients in the level equation on page 5:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4)
    In the ec1 representation, page 12, you have 9 coefficients:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec1
    For the two models to coincide, there must be 1 restriction on the coefficients in the latter version of the model.

    If you do not want to have this restriction, either estimate it with option ec, which again gives you 8 coefficients:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec
    or allow for 1 unrestricted lag of ln_inc in the model:
    Code:
    ardl ln_consump ln_inc ln_inv, lags(1 1 4)
    ardl ln_consump ln_inc ln_inv, lags(1 1 4) ec1
    ardl ln_consump ln_inc ln_inv, lags(1 1 4) ec
    In the latter case, no restriction is needed. There are 9 coefficients in each version of the model. But obviously, because we allow for a nonzero coefficient of the first lag in the level version, the estimates differ.
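    The coefficient bookkeeping above can be sketched with a small hypothetical Python helper (not part of the ardl package; the formulas are just the counting implied by the explanation, with `p` the lag order of the dependent variable and `qs` the lag orders of the regressors):

```python
def n_coef_level(p, qs):
    """Coefficients in the level form of an ARDL(p, q1, ..., qk) with constant:
    p lags of y, (q_i + 1) terms per regressor, plus the constant."""
    return p + sum(q + 1 for q in qs) + 1

def n_coef_ec1(p, qs):
    """Coefficients in the ec1 form: one speed-of-adjustment term, k long-run
    coefficients, (p - 1) short-run lags of D.y, max(q_i, 1) short-run terms
    per regressor (a regressor with q_i = 0 is still forced in at the first
    lag), plus the constant."""
    return 1 + len(qs) + (p - 1) + sum(max(q, 1) for q in qs) + 1

print(n_coef_level(1, [0, 4]), n_coef_ec1(1, [0, 4]))  # 8 9 -> one restriction
print(n_coef_level(1, [1, 4]), n_coef_ec1(1, [1, 4]))  # 9 9 -> no restriction
```

    With lags(1 0 4) the ec1 form has one extra coefficient, hence the single restriction; with lags(1 1 4) the counts match and no restriction is needed.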
    Hi dear Prof. Kripfganz,
    I am running an ARDL model with the following two commands, and I don't understand why they produce different results.

    Code:
    clear
    set seed 1234
    set obs 1000
    gen y = uniform()
    gen x1 =  rt(5)
    gen time = _n
    tsset time
    local ylist y
    local xlist x1 
    local quantile "0.1 0.25 0.5"
    
        
    local nq: word count `quantile'
    di `nq'
    
    local xnum: word count `xlist'  
    di `xnum'
    
    tempname opt
    
    local maxlags = 10
    if ("`maxlags'" != "") {
      ardl `ylist' `xlist', maxlag(`maxlags') aic ec1
      mat `opt' = e(lags)
    }
    
    mat list `opt'
    The results are as follows
    Code:
    ARDL(1,1) regression
    
    Sample:        11 -      1000                   Number of obs     =        990
                                                    R-squared         =     0.5314
                                                    Adj R-squared     =     0.5300
    Log likelihood = -180.88505                     Root MSE          =     0.2911
    
    ------------------------------------------------------------------------------
             D.y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    ADJ          |
               y |
             L1. |  -1.062711   .0318044   -33.41   0.000    -1.125124   -1.000299
    -------------+----------------------------------------------------------------
    LR           |
              x1 |
             L1. |   .0191763   .0099109     1.93   0.053    -.0002725    .0386252
    -------------+----------------------------------------------------------------
    SR           |
              x1 |
             D1. |   .0070096    .007535     0.93   0.352    -.0077768     .021796
                 |
           _cons |   .5206504   .0181471    28.69   0.000      .485039    .5562619
    ------------------------------------------------------------------------------
    .   mat `opt' = e(lags)
    . }
    
    . 
    . mat list `opt'
    
    __000000[1,2]
         y  x1
    r1   1   1
    As shown above, the optimal lags are 1 and 1. Then I run

    Code:
    ardl y x1, lags(1 1) ec1  // optimal lags equal `opt'
    ARDL(1,1) regression
    
    Sample:         2 -      1000                   Number of obs     =        999
                                                    R-squared         =     0.5334
                                                    Adj R-squared     =     0.5320
    Log likelihood =  -186.9914                     Root MSE          =     0.2924
    
    ------------------------------------------------------------------------------
             D.y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    ADJ          |
               y |
             L1. |  -1.066707   .0316454   -33.71   0.000    -1.128807   -1.004608
    -------------+----------------------------------------------------------------
    LR           |
              x1 |
             L1. |   .0186711   .0098246     1.90   0.058    -.0006082    .0379504
    -------------+----------------------------------------------------------------
    SR           |
              x1 |
             D1. |   .0084251   .0074896     1.12   0.261    -.0062722    .0231223
                 |
           _cons |   .5239664   .0181209    28.92   0.000     .4884069     .559526
    ------------------------------------------------------------------------------
    I don't know why the numbers of observations used differ: one is 990 and the other is 999.
    Which one is correct?

    Bests,
    wanhai





    • In your first code, you have set
      Code:
      local maxlags = 10
      This sets aside the first 10 observations in the estimation sample for computing the lags, even if the optimal lag order is smaller than 10.

      In your second code, you pre-specify the lag order to be (1, 1). Here, only 1 observation is set aside.
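      The arithmetic behind the two sample sizes can be written out (assuming, as described above, that estimation starts after the maximum specified lag):

```python
T = 1000          # total observations in the simulated data
maxlag = 10       # maxlag(10): lag-order selection sets aside 10 observations
obs_selection = T - maxlag   # sample size during lag-order selection
obs_final = T - 1            # lags(1 1): only one initial observation is lost
print(obs_selection, obs_final)  # 990 999
```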
      https://twitter.com/Kripfganz



      • Originally posted by Sebastian Kripfganz View Post
        In your first code, you have set
        Code:
        local maxlags = 10
        This sets aside the first 10 observations in the estimation sample for computing the lags, even if the optimal lag order is smaller than 10.

        In your second code, you pre-specify the lag order to be (1, 1). Here, only 1 observation is set aside.
        Great thanks! Yes, I understand this, but my question is which one is "right".

        In fact, although the maximum lag is set to 10 for choosing the optimal lag order, shouldn't the final estimation sample be based on the chosen optimal lag order? Is that right?

        Thanks for your time!

        Bests,
        wanhai



        • The optimal lag orders are found based on the sample which sets aside the first 10 observations. If you set aside fewer initial observations, it is possible that you get different optimal lag orders. For comparison of the models with selection criteria (AIC, BIC), the same sample must be used.

          Once you have chosen the optimal lag order, you could then use all observations as in your second code for the subsequent analysis.
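          A toy illustration of this principle (in Python rather than Stata, and for a pure autoregression rather than an ARDL; not the ardl implementation): every candidate lag order is scored by AIC on the same common sample, namely the one that sets aside `maxlag` initial observations.

```python
import numpy as np

rng = np.random.default_rng(0)
T, maxlag = 200, 10
y = np.zeros(T)
for t in range(1, T):           # simulate an AR(1) process
    y[t] = 0.5 * y[t - 1] + rng.standard_normal()

def aic_ar(y, p, start):
    # OLS fit of y_t on (1, y_{t-1}, ..., y_{t-p}) over t = start, ..., T-1
    Y = y[start:]
    X = np.column_stack([np.ones(len(Y))]
                        + [y[start - l:-l] for l in range(1, p + 1)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    n, k = len(Y), X.shape[1]
    sigma2 = resid @ resid / n
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return -2 * loglik + 2 * k

# common sample: every candidate lag order starts at t = maxlag
aics = {p: aic_ar(y, p, maxlag) for p in range(1, maxlag + 1)}
best = min(aics, key=aics.get)
print("AIC-optimal lag order on the common sample:", best)
```

          Once the optimal order is chosen this way, the final model can be re-estimated on the full available sample, as in the second code above.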



          • Originally posted by Sebastian Kripfganz View Post
            The optimal lag orders are found based on the sample which sets aside the first 10 observations. If you set aside fewer initial observations, it is possible that you get different optimal lag orders. For comparison of the models with selection criteria (AIC, BIC), the same sample must be used.

            Once you have chosen the optimal lag order, you could then use all observations as in your second code for the subsequent analysis.
            I see, great thanks for your timely help!

            Bests,
            wanhai



            • Originally posted by Sebastian Kripfganz View Post
              The difficulty here is that the optimal lag order for ln_inc is zero, but you are forcing it to appear in the first lag in the EC representation (option ec1). This is achieved by effectively imposing a constraint as follows:
              Code:
              ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec1
              constraint 1 L.ln_inc = D.ln_inc
              cnsreg D.ln_consump L.ln_consump L.ln_inc L.ln_inv D.ln_inc L(0/3)D.ln_inv if e(sample), constraints(1)
              The long-run coefficients in the ardl output are a nonlinear coefficient combination:
              Code:
              nlcom (- _b[L.ln_inc] / _b[L.ln_consump]) (- _b[L.ln_inv] / _b[L.ln_consump])
              If you use the ec instead of the ec1 option, this complication does not arise and you can replicate the results directly with regress:
              Code:
              ardl ln_consump ln_inc ln_inv, lags(1 0 4) ec
              reg D.ln_consump L.ln_consump ln_inc ln_inv L(0/3)D.ln_inv if e(sample)
              nlcom (- _b[ln_inc] / _b[L.ln_consump]) (- _b[ln_inv] / _b[L.ln_consump])
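              The nlcom ratio can be illustrated with a toy simulation (Python, with a simple ADL(1,0) data-generating process invented here for illustration, not the Lutkepohl data): the long-run coefficient is the negative ratio of the level coefficients from the EC regression.

```python
import numpy as np

rng = np.random.default_rng(42)
T = 5000
theta = 2.0     # true long-run multiplier
alpha = -0.5    # true speed of adjustment
x = np.cumsum(rng.standard_normal(T))   # an I(1) regressor
y = np.zeros(T)
for t in range(1, T):
    # EC form of the DGP: D.y_t = alpha * (y_{t-1} - theta * x_t) + e_t
    y[t] = y[t - 1] + alpha * (y[t - 1] - theta * x[t]) + rng.standard_normal()

dy = np.diff(y)                                        # D.y
X = np.column_stack([np.ones(T - 1), y[:-1], x[1:]])   # const, L.y, x
b = np.linalg.lstsq(X, dy, rcond=None)[0]
lr = -b[2] / b[1]   # the nlcom ratio: - b[x] / b[L.y]
print(lr)           # should be close to the true theta = 2.0
```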

              Dear professor, I have another question. As you mentioned in the previous post, when the optimal lag order for x is zero and we still estimate the model in the EC representation (option ec1), this is achieved by effectively imposing a constraint. However, some routines, such as sqreg (quantile regression), do not allow the constraints() option; that is, we cannot estimate the model with the constraint. Do you have any suggestions?

              Great thanks!

              wanhai



              • In that case, you would need to impose a minimum lag order of 1. That might be a little less efficient but I would not worry much about it.



                • Originally posted by Sebastian Kripfganz View Post
                  In that case, you would need to impose a minimum lag order of 1. That might be a little less efficient but I would not worry much about it.
                  OK, I do appreciate your help! Thanks for your time!

                  Bests,
                  wanhai



                  • POST #324:
                    Originally posted by Sebastian Kripfganz View Post
                    Here is an algorithm that should work:
                    Code:
                    webuse lutkepohl2
                    ardl ln_consump ln_inc ln_inv, ec
                    
                    loc varname "ln_inv"
                    
                    local lag = el(e(lags), 1, colnumb(e(lags), "`varname'"))
                    local --lag // the number of lags in the EC form is one less than the number of lags in levels
                    
                    if `lag' >= 0 {
                        loc lagsum "D.`varname'"
                        loc laglist "D.`varname'"
                        forv l = 1/`lag' {
                            loc lagsum "`lagsum' + L`l'D.`varname'"
                            loc laglist "`laglist' L`l'D.`varname'"
                        }
                        lincom `lagsum'
                        test `laglist'
                    }
                    ardl stores the number of lags in the level representation of the model in the matrix e(lags). The relevant lag order can be extracted from that matrix. The above code can be easily adjusted to work with the ec1 instead of the ec option by removing the if `lag' >= 0 condition.
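                    The string-building part of this algorithm can be mirrored in Python (a hypothetical helper, just to show the expression that lincom and test receive for a given EC-form lag order):

```python
def build_lagsum(varname, lag):
    # D.x plus L1D.x, ..., L{lag}D.x, as in the forvalues loop above
    terms = [f"D.{varname}"] + [f"L{l}D.{varname}" for l in range(1, lag + 1)]
    return " + ".join(terms), terms

expr, terms = build_lagsum("ln_inv", 3)
print(expr)  # D.ln_inv + L1D.ln_inv + L2D.ln_inv + L3D.ln_inv
```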
                    Dear Professor Kripfganz

                    I have used your code above to get a lincom result for each ardl regression; however, my problem is that I have 133 ardl regressions and at least 133 results. I would like to export the lincom results to LaTeX, but I can't seem to figure it out. Do you know how to use, e.g., outreg for the lincom results of each regression? I am currently using esttab for the regression results.

                    Thanks in advance.

                    My code:

                    Code:
                    forval i = 1/133 {
                        eststo res`i': ardl fpidif faoalldif oildif exc_div if (id == `i'), lags(4 4 4 4) aic ec
                        loc varname "faoalldif"

                        local lag = el(e(lags), 1, colnumb(e(lags), "`varname'"))
                        local --lag // the number of lags in the EC form is one less than the number of lags in levels

                        if `lag' >= 0 {
                            loc lagsum "D.`varname'"
                            loc laglist "D.`varname'"
                            forv l = 1/`lag' {
                                loc lagsum "`lagsum' + L`l'D.`varname'"
                                loc laglist "`laglist' L`l'D.`varname'"
                            }
                            lincom `lagsum'
                            test `laglist'
                        }
                    }
                    esttab using ardl_log.tex, se starlevels(* 0.1 * 0.05 ** 0.01) replace tex

                    Best regards
                    Louise



                    • Since esttab is a wrapper for estout, you should be able to add r() results to your output table. As this is not specifically related to the ardl package, I recommend starting a new topic where others with more experience in this regard might be able to help. Alternatively, have a look at Ben Jann's new sttex package: https://www.statauk.timberlake-confe...om/proceedings
                      Last edited by Sebastian Kripfganz; 27 Sep 2022, 04:16.



                      • Hello Sebastian, first of all, thank you very much for all the replies in this thread; they have been very useful for me. I am trying to use an ARDL model to find evidence of short-run and long-run effects of changes in the exchange rate on the trade balance. I use the trade balance as the dependent variable and the exchange rate and real GDP for Sweden and the euro area as independent variables. However, I am not sure whether I should use both ardl lnC lnGDPEA lnGDPSE lnREBEX, lags(7 2 1 4) and ardl lnC lnGDPEA lnGDPSE lnREBEX, lags(7 2 1 4) ec to estimate the short-run and long-run effects; that is, the first equation to estimate the short-run effects and the second to estimate the long-run effects. I know that short-run estimates are also presented in the second equation, but they differ from those of the first equation. Which table should I use to interpret the short-run effects?

                        Thank you very much.

                        Table 1
                        Code:
                        ARDL(7,2,1,4) regression
                        
                        Sample: 1996q4 thru 2022q3                      Number of obs     =        104
                                                                        F(17, 86)         =    1220.17
                                                                        Prob > F          =     0.0000
                                                                        R-squared         =     0.9959
                                                                        Adj R-squared     =     0.9951
                        Log likelihood = 214.77285                      Root MSE          =     0.0337
                        
                        ------------------------------------------------------------------------------
                                 lnC | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                        -------------+----------------------------------------------------------------
                                 lnC |
                                 L1. |   1.629482   .0956644    17.03   0.000     1.439307    1.819656
                                 L2. |  -.8719938   .1541952    -5.66   0.000    -1.178524     -.565464
                                 L3. |   .1916431   .1215106     1.58   0.118    -.0499119    .4331981
                                 L4. |    .535612   .1160356     4.62   0.000     .3049408    .7662832
                                 L5. |  -1.222609   .1205385   -10.14   0.000    -1.462232    -.9829863
                                 L6. |   1.093014   .1469945     7.44   0.000     .8007987     1.38523
                                 L7. |  -.4022034   .0893606    -4.50   0.000    -.5798465    -.2245603
                                     |
                             lnGDPEA |
                                 -.  |   -.005067   .3845419    -0.01   0.990     -.769511      .759377
                                 L1. |   .1248658   .3936466     0.32   0.752    -.6576778     .9074094
                                 L2. |   .1047407   .2237506     0.47   0.641    -.3400608     .5495422
                                     |
                             lnGDPSE |
                                 -.  |  -.0580641   .4912167    -0.12   0.906    -1.034571     .9184424
                                 L1. |  -.1288229   .4985562    -0.26   0.797     -1.11992      .862274
                                     |
                             lnREBEX |
                                 -.  |   .1235633   .1862408     0.66   0.509    -.2466711     .4937977
                                 L1. |   .0304421   .2450259     0.12   0.901    -.4566532     .5175374
                                 L2. |  -.2024231   .2440323    -0.83   0.409    -.6875432     .2826969
                                 L3. |  -.0788881   .2484374    -0.32   0.752    -.5727653     .4149891
                                 L4. |   .0777871   .1745874     0.45   0.657    -.2692811     .4248554
                                     |
                               _cons |  -.1458299   .6765544    -0.22   0.830    -1.490776     1.199116
                        ------------------------------------------------------------------------------
                        Table 2
                        Code:
                        ARDL(7,2,1,4) regression
                        
                        Sample: 1996q4 thru 2022q3                      Number of obs     =        104
                                                                        R-squared         =     0.6862
                                                                        Adj R-squared     =     0.6242
                        Log likelihood = 214.77285                      Root MSE          =     0.0337
                        
                        ------------------------------------------------------------------------------
                               D.lnC | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                        -------------+----------------------------------------------------------------
                        ADJ          |
                                 lnC |
                                 L1. |  -.0470552   .0201807    -2.33   0.022    -.0871731    -.0069373
                        -------------+----------------------------------------------------------------
                        LR           |
                             lnGDPEA |   4.771832   6.996606     0.68   0.497     -9.13696     18.68062
                             lnGDPSE |  -3.971654    3.18677    -1.25   0.216    -10.30674     2.363435
                             lnREBEX |  -1.052353   2.293742    -0.46   0.648    -5.612162     3.507455
                        -------------+----------------------------------------------------------------
                        SR           |
                                 lnC |
                                 LD. |   .6765369   .0943641     7.17   0.000     .4889473     .8641265
                                L2D. |  -.1954569   .0869558    -2.25   0.027    -.3683194    -.0225944
                                L3D. |  -.0038138   .0732101    -0.05   0.959    -.1493507     .1417231
                                L4D. |   .5317981   .0708273     7.51   0.000     .3909981     .6725982
                                L5D. |  -.6908108    .084403    -8.18   0.000    -.8585985    -.5230231
                                L6D. |   .4022034   .0893606     4.50   0.000     .2245603     .5798465
                                     |
                             lnGDPEA |
                                 D1. |  -.2296065    .434838    -0.53   0.599    -1.094036     .6348229
                                 LD. |  -.1047407   .2237506    -0.47   0.641    -.5495422     .3400608
                                     |
                             lnGDPSE |
                                 D1. |   .1288229   .4985562     0.26   0.797     -.862274      1.11992
                                     |
                             lnREBEX |
                                 D1. |    .173082   .1681912     1.03   0.306     -.161271      .507435
                                 LD. |   .2035241   .1683902     1.21   0.230    -.1312245     .5382726
                                L2D. |   .0011009    .171996     0.01   0.995    -.3408158     .3430177
                                L3D. |  -.0777871   .1745874    -0.45   0.657    -.4248554     .2692811
                                     |
                               _cons |  -.1458299   .6765544    -0.22   0.830    -1.490776     1.199116
                        ------------------------------------------------------------------------------
                        Last edited by Olav Hose; 03 Apr 2023, 04:04.



                        • The second regression gives you the short-run effects conditional on the long-run relationship. That is, these short-run effects are the remaining effects after accounting for the adjustment to any deviation from the long-run equilibrium; the latter adjustment is governed by the speed-of-adjustment coefficient.

                          In the first regression, these effects are mixed up, which makes it difficult to interpret the coefficients, but this level representation is still fine for prediction purposes.

                          Please also see our forthcoming Stata Journal article:



                          • Originally posted by Sebastian Kripfganz View Post
                            The second regression gives you the short-run effects conditional on the long-run relationship. That is, these short-run effects are the remaining effects after accounting for the adjustment to any deviation from the long-run equilibrium; the latter adjustment is governed by the speed-of-adjustment coefficient.

                            In the first regression, these effects are mixed up, which makes it difficult to interpret the coefficients, but this level representation is still fine for prediction purposes.

                            Please also see our forthcoming Stata Journal article:
                             Thank you very much, it is very helpful. I do have another question. I have realised that many of my ARDL estimations lack a long-run relationship, for example because the F-statistic is below the lower bound. However, I have also realised that when using the first differences of all the variables in the equation, the model becomes significant. My question is therefore: could you use the first differences of all variables in an ARDL model instead of the variables in levels, or would that destroy any potential long-run relationship? If it is possible to use first-differenced variables, would you still recommend using ec1? All my data are either I(0) or I(1) (the large majority I(1)).

                            Thank you once again!



                            • Dear community,
                              I need your help.
                              I am currently running a regression of corruption's effect on economic growth in Malaysia. The data for corruption are only available from 1995 to 2020, so I only have 25 observations. I have 5 independent variables, and when I attempt to find the optimal number of lags using the ardl command, it comes up with this:
                              Code:
                              note: L2.lnGOV omitted because of collinearity.
                              note: L3.lnGOV omitted because of collinearity.
                              note: L4.lnGOV omitted because of collinearity.
                              note: Trad omitted because of collinearity.
                              note: L.Trad omitted because of collinearity.
                              note: L2.Trad omitted because of collinearity.
                              note: L3.Trad omitted because of collinearity.
                              note: L4.Trad omitted because of collinearity.

                              Any suggestions on what I can do to fix this problem would be much appreciated.
                              Last edited by Alex Grisdale; 05 Apr 2023, 11:19.



                              • Hi, I have been running an ARDL model, but I have recently found that there is serial correlation. Is there a way to incorporate HAC (Newey–West) standard errors into the ARDL model, as with newey, when also using an error-correction term (since cointegration is present)?

                                Thanks!

