Auto-ARIMA is now available in Stata! The new arimaauto and xtarimau commands

Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#16

11 Oct 2022, 15:23

Hello everyone, I fixed the bug for the hegy package (arimaauto now offers to install it manually). Kit Baum will hopefully update the version in the SSC soon.
All other described problems are related to the Stata version. Please compile the .mlib file yourself: simply run the file below in Stata and move the resulting .mlib from the working directory or "l" subfolder into your PERSONAL folder (type "sysdir" to locate it).

https://github.com/econcz/stata-arim...arimaauto.mata

After some consideration and compatibility problems, I decided to set the version of arimaauto to Stata 16.
If you have a lower version of Stata, you can try changing the versions in the ado and mata file and save/compile them. Simply:

Code:

which arimaauto
doedit {the displayed path}

and follow the procedure described above.
You can save the result into your personal folder to protect against package updates.

PS I have Stata 17 and it can only compile code for version 17.
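
For readers following along, the manual compilation could look roughly like the sketch below. It assumes the Mata source has been downloaded from the GitHub repository and saved as arimaauto.mata in the current working directory (the local file name is an assumption), that running it writes larimaauto.mlib to the working directory (the library name is taken from post #18), and that your PERSONAL directory already exists.

```stata
* Show where the installed ado-file lives and list the system directories
which arimaauto
sysdir

* Run the downloaded Mata source; per the post above this builds the .mlib
* (assumed local file name: arimaauto.mata in the working directory)
do arimaauto.mata

* Copy the compiled library into PERSONAL (assumes that directory exists
* and that the library was written to the working directory)
copy larimaauto.mlib "`c(sysdir_personal)'larimaauto.mlib", replace

* Rebuild Mata's library index so the new library is found
mata: mata mlib index
```

If the .mlib ends up in the "l" subfolder instead, the source path in the copy line would need to be adjusted accordingly.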
Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#17

11 Oct 2022, 15:25

Originally posted by Akash Issar

Hi Ilya Bolotov! Thank you for bringing auto.arima from R into Stata. I am using Stata 17.0 and I encounter an error when I use the "if" condition to run arimaauto on a training set. Here is an example using the "data_uk.dta" provided with the hegy package.

Code:

. sysuse data_uk.dta
// Create a training set indicator
. gen insample = (time<=tm(2010m12))
// The regular arimaauto works fine without the "if" condition
arimaauto luk
// Below I use the "if" condition to run arimaauto on the training set
. arimaauto luk if insample==1
Error: convergence failure or at least one eigenvalue is at least .999001
// I receive the above error

I tried running arimaauto on other datasets too and received the same error. However, the R command `auto.arima()` can be run on a training set.

Looking forward to your help with this issue.

Thanks

A very good question! arimaauto uses Stata's arima command, so I guess the ML estimation in Stata and R is different in some way :-)
I don't think it's an error.
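
Because arimaauto hands estimation over to arima, one way to narrow this down might be to fit a couple of fixed specifications with arima directly on the same subsample and see whether the convergence message also appears there. A minimal sketch, assuming the data from the quoted post are in memory and tsset, and that insample has been created as shown there; the two orders are arbitrary illustrations, not what arimaauto would select:

```stata
* Fit fixed ARIMA specifications on the training sample only
arima luk if insample==1, arima(1,0,0)
arima luk if insample==1, arima(1,1,0)

* Count the observations in the last estimation sample
count if e(sample)
```

If plain arima fails in the same way, the message comes from the estimation step rather than from the model search.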
Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#18

15 Oct 2022, 10:53

If someone has already compiled larimaauto.mlib for Stata 16, could you please send the file to me? I'll ask Kit Baum to fix the version in SSC.
Thank you in advance!

Last edited by Ilya Bolotov; 15 Oct 2022, 10:57.
Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#19

16 Oct 2022, 12:22

Please ignore my last message; I got my hands on an old Stata 15 installer (I still have the license from the purchase) and managed to compile arimaauto with it.
Surprisingly, it seems to work well, so I downgraded the program's requirements to version 15 and sent the update to Kit Baum to be uploaded to SSC. It should appear soon under version 1.0.4. This time all files are mutually consistent, and no errors should arise (fingers crossed).

Those who have an older Stata edition (13 and 14), please follow the advice in this post.

Many thanks for everybody's feedback!
Mario Ferri

Join Date: Jul 2019

Posts: 190
#20

09 Feb 2023, 08:40

Originally posted by Ilya Bolotov

Please ignore my last message; I got my hands on an old Stata 15 installer (I still have the license from the purchase) and managed to compile arimaauto with it.
Surprisingly, it seems to work well, so I downgraded the program's requirements to version 15 and sent the update to Kit Baum to be uploaded to SSC. It should appear soon under version 1.0.4. This time all files are mutually consistent, and no errors should arise (fingers crossed).

Those who have an older Stata edition (13 and 14), please follow the advice in this post.

Many thanks for everybody's feedback!

Ilya, three technical questions:
1. Does the data first have to be transformed into a stationary series, or can the command handle that by itself?
2. Can it implement ARIMA with the Kalman filter (ideally with the diffuse option)?
3. What is the prediction equation used here?

Last edited by Mario Ferri; 09 Feb 2023, 08:44.
Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#21

27 May 2023, 12:46

Originally posted by Mario Ferri

Ilya, three technical questions:
1. Does the data first have to be transformed into a stationary series, or can the command handle that by itself?
2. Can it implement ARIMA with the Kalman filter (ideally with the diffuse option)?
3. What is the prediction equation used here?

Dear Mario,

It's literally a Mata wrapper for Stata's built-in arima command.
In other words, arimaauto uses the arima command to make all estimations.
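
Since estimation is done by arima, its own options and postestimation tools are the place to look for questions 2 and 3. As a rough illustration with plain arima (not arimaauto, whose pass-through of these options is not confirmed in this thread), using the wpi1 example dataset from the Stata manuals and an arbitrarily chosen ARIMA(1,1,1):

```stata
* Example dataset from the Stata documentation (already tsset)
webuse wpi1, clear

* arima estimates by maximum likelihood via the Kalman filter;
* the diffuse option requests a diffuse prior for the initial state
arima wpi, arima(1,1,1) diffuse

* Predictions of wpi in levels (one-step-ahead by default)
predict wpi_hat, y
list t wpi wpi_hat in 1/8
```

The arima and arima postestimation manual entries document the state-space form and the prediction options in detail.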
bechir bouzid

Join Date: Feb 2019

Posts: 5
#22

13 Dec 2023, 21:13

Ilya Bolotov, thanks again for sharing this super useful command with the community.

I have a quick question. I am working with panel data and want to run arimaauto on a CPI variable for each of my countries using monthly data (to be used for forecasting purposes).

However, when I run the arimaauto command for each country, I get the following message:

. arimaauto cpi, if country=="Albania"
"something that should be true of your data is not"

Note that when I try to run a loop with the [foreach] command, I do not get any output (the loop is of course the ideal approach for me, compared to the [if country==] command above).

Thanks for any help or guidance on how to implement this function on panel data with a separate output for each country (my .dta is attached).

Best
I
Attached Files

TEST___ CPI__2024.dta (17.4 KB, 1 view)
Al Molina

Join Date: Oct 2022

Posts: 2
#23

09 Jan 2024, 01:02

Huuu. I only have STATA version 14. :*(
Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#24

16 Jan 2024, 14:46

Originally posted by bechir bouzid

Ilya Bolotov, thanks again for sharing this super useful command with the community.

I have a quick question. I am working with panel data and want to run arimaauto on a CPI variable for each of my countries using monthly data (to be used for forecasting purposes).

However, when I run the arimaauto command for each country, I get the following message:

. arimaauto cpi, if country=="Albania"
"something that should be true of your data is not"

Note that when I try to run a loop with the [foreach] command, I do not get any output (the loop is of course the ideal approach for me, compared to the [if country==] command above).

Thanks for any help or guidance on how to implement this function on panel data with a separate output for each country (my .dta is attached).

Best
I

Have you tried the xtarimau command? It is a panel version of the arimaauto :-)
https://ideas.repec.org/c/boc/bocode/s459048.html
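
For anyone who would rather keep using arimaauto itself, a per-country loop might look like the sketch below. It is only an illustration and untested on the attached file: cpi and country come from the quoted post, while the monthly time variable named time is hypothetical. Each country is isolated with preserve and keep so that the command sees a single time series. Note also that Stata's usual syntax places the if qualifier before the comma.

```stata
* Loop over countries, fitting one series at a time
levelsof country, local(countries)
foreach c of local countries {
    preserve
    keep if country == `"`c'"'
    tsset time                      // hypothetical monthly time variable
    display as text _newline "Country: `c'"
    capture noisily arimaauto cpi   // keep looping even if one country fails
    restore
}
```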
Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#25

16 Jan 2024, 14:48

Originally posted by Al Molina

Huuu. I only have STATA version 14. :*(

arimaauto should work under Stata 14.0, but you will need to compile the .mlib file manually. Please install arimaauto from SSC, then download and run this file in Stata:
https://github.com/econcz/stata-arim...arimaauto.mata

Henry Cust

Join Date: Nov 2021
Posts: 18

#26

19 Jan 2024, 07:00

Hi,

Thanks for making this command available on Stata, it has been incredibly useful for me.

I have a query about the unit root section of the automation that doesn't make sense to me. I was attempting to replicate the dfgls test that the command automatically runs. My arimaauto code runs as:

Code:

arimaauto var

Snippet of output:

HTML Code:

Model1: ARIMA(2,0,2) with constant
Model2: ARIMA(0,0,0) with constant
Model3: ARIMA(1,0,0) with constant
Model4: ARIMA(0,0,1) with constant

Unit root tests:

             |  unit root |  lags |      Stat    1% crit    5% crit   10% crit
-------------+------------+-------+-------------------------------------------
         var |            |       |                                          
       DFGLS |          0 |     7 |  3.074446   3.339556   4.247613   0.163282
        KPSS |          0 |     7 |  0.093091   0.216000   0.146000   0.119000

The issue is the Stat, 1% crit, 5% crit and 10% crit values in the DFGLS row. First of all, the numbers aren't ordered, which is my first red flag.

When I duplicate the DFGLS using:

Code:

    dfgls var, notrend
    mat li r(results)
    
    dfgls var, maxlag(`r(maiclag)') notrend

it gives the output:

HTML Code:

DF-GLS test for unit root                   Number of obs = 78
Variable: var
Lag selection: Schwert criterion            Maximum lag   = 11

                             -------- Critical value ---------
 [lags]       DF-GLS mu           1%           5%          10%
--------------------------------------------------------------
     11           0.047       -2.604       -1.987       -1.690
     10           0.329       -2.604       -2.004       -1.706
      9           0.093       -2.604       -2.021       -1.723
      8           0.175       -2.604       -2.039       -1.740
      7           0.163       -2.604       -2.056       -1.757
      6          -0.284       -2.604       -2.073       -1.774
      5           0.775       -2.604       -2.090       -1.790
      4           0.272       -2.604       -2.106       -1.805
      3           0.243       -2.604       -2.122       -1.819
      2           0.266       -2.604       -2.136       -1.832
      1          -0.108       -2.604       -2.148       -1.844
--------------------------------------------------------------
Opt lag (Ng–Perron seq t) = 11 with RMSE = 4.095471
Min SIC  =  3.177813 at lag  1 with RMSE =  4.63229
Min MAIC =  3.074446 at lag  7 with RMSE = 4.247613

.         mat li r(results)

r(results)[11,5]
             k        MAIC         SIC        RMSE       DFGLS
r1          11   3.1020417   3.4900263   4.0954709   .04668016
r1          10   3.1352375   3.4824285   4.1954913   .32857683
r1           9   3.1220409   3.4490061   4.2428145   .09280341
r1           8   3.1005003    3.395332   4.2474441   .17454652
r1           7   3.0744456   3.3395563    4.247613    .1632823
r1           6   3.1288425   3.3595467   4.4117877  -.28377006
r1           5   3.1960485   3.3657744   4.5508839   .77511198
r1           4   3.1588123    3.331803   4.6009528   .27169396
r1           3   3.1321191    3.276169   4.6014617   .24250955
r1           2   3.1065012   3.2203139    4.601462   .26597661
r1           1   3.0920902   3.1778133   4.6322902  -.10766455

The first output picks lag = 7 from the minimised MAIC (arimaauto does the same). However, at lag 7 (and all other lags), we fail to reject the null that implies a unit root and requires differencing. What is reported in the arimaauto output is from the r(results), i.e. the MAIC, SIC, RMSE, DFGLS at lag 7, rather than the critical values it implies it is reporting.

My question is: Is this intentional? If not, is arimaauto not applying differencing in the correct manner because it isn't pulling the correct critical values?

Thanks in advance
Henry

P.S. I'm using version 18 MP

Last edited by Henry Cust; 19 Jan 2024, 07:05. Reason: version clarification


Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#27

09 Feb 2024, 09:02

Dear Henry,

There was a bug in retrieving critical values for the DF-GLS test, caused by my efforts to make arimaauto backward compatible with Stata 15. As you (and not only you) have correctly observed, the values do not make sense because they are taken from the wrong matrix. I apologize for the mistake.
Please find the corrected Stata 15-compatible Mata library (the mlib file) along with the ado and sthlp files here: https://github.com/econcz/stata-arimaauto (you can download the 1.0.5 release as a ZIP file). These files should be put into your PERSONAL folder (type sysdir to locate it). I'm sending the code to Kit Baum as well for it to be uploaded to the SSC.

It is now officially impossible to automatically collect critical values for the DF-GLS test under Stata 15. Please run

Code:

arimaauto var, trace(1)

to perform a manual evaluation.
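
For those who still want to cross-check the DF-GLS statistic by hand, a small sketch based on the dfgls call quoted in post #26 follows. It assumes r(results) has the layout shown there (a DFGLS column, with rows running from the maximum lag down to lag 1); var is the placeholder variable name from that post. The critical values are not part of that matrix, so they still have to be read from the displayed table.

```stata
* Run the test and keep its stored results
dfgls var, notrend
matrix R = r(results)
local k = r(maiclag)

* Rows run from the maximum lag down to 1, so lag k sits in this row
local row = rowsof(R) - `k' + 1

* Pull the DF-GLS statistic at the MAIC-selected lag
display "MAIC lag: `k'"
display "DF-GLS statistic at that lag: " R[`row', colnumb(R, "DFGLS")]
```

The statistic shown can then be compared with the 5% critical value printed for the same lag in the dfgls table.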

arimaauto is an algorithm that performs tests, based on existing commands, and passes variables to Stata's built-in arima program. The logic is the following: 1) perform the HEGY test, 2) perform the DF-GLS and KPSS tests, 3) select the best ARIMA model based on the Hyndman-Khandakar (HK) algorithm from R.
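
To make step 3 a little more concrete: the Hyndman-Khandakar procedure in R starts from four candidate models, ARIMA(2,d,2), (0,d,0), (1,d,0) and (0,d,1), which matches the Model1 to Model4 lines in the output quoted in post #26, and then searches nearby orders. The sketch below only mimics that first comparison with plain arima and estat ic on the wpi1 manual dataset, with d fixed at 1 by assumption and AIC as the criterion (R's auto.arima defaults to AICc). It is not arimaauto's internal code.

```stata
webuse wpi1, clear

* AR and MA orders of the four starting models
local plist "2 0 1 0"
local qlist "2 0 0 1"

local best_aic = .
local best_spec ""
forvalues i = 1/4 {
    local p : word `i' of `plist'
    local q : word `i' of `qlist'
    * Skip candidates that fail to estimate
    capture quietly arima wpi, arima(`p',1,`q')
    if _rc != 0 continue
    * estat ic leaves AIC in column 5 of r(S)
    quietly estat ic
    matrix S = r(S)
    local aic = S[1,5]
    display "ARIMA(`p',1,`q')  AIC = " %9.3f `aic'
    * Missing (.) compares as larger than any number, so the first fit wins
    if `aic' < `best_aic' {
        local best_aic = `aic'
        local best_spec "`p',1,`q'"
    }
}
display "Lowest AIC among the four starting models: ARIMA(`best_spec')"
```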

I recommend running

Code:

arimaauto var, trace(1)

for test output instead of manual replication.

Take care,
Ilya

Last edited by Ilya Bolotov; 09 Feb 2024, 09:10.
Hsin-Yun Kuan

Join Date: Feb 2024

Posts: 3
#28

19 Feb 2024, 08:16

Inquiry on ARIMAAuto::put(): 3301 subscript invalid Error in Stata 16

Originally posted by hee sun

Dear Ilya,

I'm using Stata 15.1 and I received the following error:

Code:

. sysuse gnp96.dta, clear
. arimaauto gnp96
ARIMAAuto::put(): 3301 subscript invalid
<istmt>: - function returned error

Is there anything I can do to fix the problem?

Thank you

Dear Ilya,

I wish to extend my deepest appreciation for your diligent work on enhancing the arimaauto command. Your commitment to improving our analytic tools is genuinely commendable.

Currently, I use Stata version 16 and have encountered a technical challenge that seems to be shared by other users: an error message displaying

Code:

ARIMAAuto::put(): 3301 subscript invalid.
<istmt>: - function returned error
r(3301);

If you have any suggestions or potential solutions to overcome this obstacle, it would be of great assistance. Your expertise in this matter is highly valued.

Thank you in advance for considering my request, and I look forward to any advice you may be able to provide.

Best regards,
Hsin-Yun
Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#29

19 Feb 2024, 15:52

Dear Hsin-Yun,

I know it seems trivial, but have you tried restarting Stata (closing and re-opening it)?
Also please check your configuration of c(level).
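
A quick way to inspect and reset the confidence level, in case it has been changed from Stata's factory setting of 95 (whether this resolves error 3301 is not confirmed in this thread):

```stata
* Show the current default confidence level
display c(level)

* Reset it to the factory setting
set level 95
```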

Kind regards,
Ilya

Last edited by Ilya Bolotov; 19 Feb 2024, 15:55.
Ilya Bolotov

Join Date: Nov 2018

Posts: 75
#30

01 Apr 2024, 13:08

Update

Following the feedback, arimaauto and xtarimau will display full HEGY errors and warn about panel and non-panel data, starting from versions 1.0.6 and 1.0.3, respectively (I've submitted both to the SSC).
No more confusing messages! :-)
xtarimau will also offer a simple sdtesti-based test of heterogeneity in models across the panel.

Many thanks to Kit Baum for updating the files in the SSC archives!