I am using ARIMA modelling to compute an ex-post forecast of a monthly time series (April 2011 - Jan 2018). In Nov 2016 a shock occurred to the series, and I am examining its effect. April 2011 to Oct 2016 is therefore my estimation window, and for Nov 2016 to Jan 2018 I want to produce a dynamic ARIMA forecast. I have already identified an ARIMA(2,1,1) model as adequate for the data and obtained the point forecasts for the period from Nov 2016 onward. These are the data and the forecasts:
clear
input double month float(lnVol_PPIs f_lnVol_PPIs)
615 1.1249295 1.1249295 <-- up to month 681 (Oct 2016) the actual and predicted values are identical
616 1.166271 1.166271
617 .9516579 .9516579
618 1.0919234 1.0919234
619 -.3856625 -.3856625
620 .8671005 .8671005
621 .7975072 .7975072
622 -.54472715 -.54472715
623 1.1249295 1.1249295
624 1.160021 1.160021
625 1.0952734 1.0952734
626 1.286474 1.286474
627 1.280934 1.280934
628 1.3110318 1.3110318
629 1.1819606 1.1819606
630 1.3202232 1.3202232
631 1.4038192 1.4038192
632 1.6331544 1.6331544
633 1.7406234 1.7406234
634 1.821972 1.821972
635 1.9712994 1.9712994
636 1.8995472 1.8995472
637 1.9919738 1.9919738
638 2.3373456 2.3373456
639 2.1610427 2.1610427
640 2.218116 2.218116
641 2.2013965 2.2013965
642 2.311545 2.311545
643 2.388037 2.388037
644 2.3799787 2.3799787
645 2.3379292 2.3379292
646 2.2669249 2.2669249
647 2.3780813 2.3780813
648 2.599741 2.599741
649 2.646884 2.646884
650 2.807714 2.807714
651 2.755314 2.755314
652 2.7880056 2.7880056
653 2.8084755 2.8084755
654 2.929057 2.929057
655 2.9959056 2.9959056
656 3.1286116 3.1286116
657 3.4365644 3.4365644
658 3.272572 3.272572
659 3.366606 3.366606
660 3.511317 3.511317
661 3.4115534 3.4115534
662 3.990921 3.990921
663 4.3089423 4.3089423
664 3.84833 3.84833
665 4.015055 4.015055
666 4.1671867 4.1671867
667 3.969315 3.969315
668 4.0606074 4.0606074
669 4.1242795 4.1242795
670 4.137712 4.137712
671 4.2293158 4.2293158
672 4.178202 4.178202
673 4.180052 4.180052
674 4.2773986 4.2773986
675 4.238445 4.238445
676 4.261927 4.261927
677 4.3435493 4.3435493
678 4.3547974 4.3547974
679 4.5672884 4.5672884
680 4.575478 4.575478
681 4.843437 4.843437
682 5.131805 4.916778 <-- here my forecast is starting, actual and forecasted values differ
683 5.564866 4.809718
684 5.689668 5.061981
685 5.634844 4.993319
686 5.835081 5.073544
687 5.864288 5.191876
688 5.627923 5.159405
689 5.543809 5.287757
690 5.599319 5.321162
691 5.565058 5.360728
692 5.481844 5.458627
693 5.501973 5.479792
694 5.464526 5.555726
695 5.767844 5.616961
696 5.889433 5.65839
end
format %tmMon_CCYY month
Currently, I am struggling to estimate forecast intervals that reflect the forecast errors of my point predictions. As the forecast error variance increases with the forecast horizon, the forecast intervals should widen over time. When I fit the ARIMA(2,1,1) in SPSS (which computes the confidence intervals automatically), this is indeed the case:

Unfortunately, I do not know how to reproduce these intervals in Stata. I tried it in two ways:
1) Using Stata's predict command to obtain the MSE:
preserve
drop if month > 681
arima lnVol_PPIs, arima(2,1,1)
restore
predict volpred_dyn, y dynamic(682)
predict MSE_dyn, mse dynamic(682)
gen upper_dyn = volpred_dyn + 1.96 * sqrt(MSE_dyn)
gen lower_dyn = volpred_dyn - 1.96 * sqrt(MSE_dyn)
This delivers:
So, just the confidence interval for November 2016 is correctly specified. The intervals for predictions further away are too narrow.
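To make "too narrow" concrete, here is a sketch of the textbook h-step forecast error variance that I would expect the intervals to follow. It rests on assumptions that are mine, not output from Stata: the sign convention in which the MA term enters with a plus sign, parameters treated as known (no estimation uncertainty), and the symbols \phi_1, \phi_2, \theta, \sigma, \beta_0 and \psi_j, which are just labels for the AR, MA, innovation standard deviation, constant and moving-average-representation weights.

```latex
% Assumed model: ARIMA(2,1,1) for y_t, ARMA(2,1) disturbance on the first difference
\begin{aligned}
\Delta y_t &= \beta_0 + u_t, \qquad
(1-\phi_1 L-\phi_2 L^2)\,u_t = (1+\theta L)\,\varepsilon_t, \qquad
\varepsilon_t \sim (0,\sigma^2) \\[4pt]
% psi-weights of the level series, from (1-\phi_1 L-\phi_2 L^2)(1-L)\,\psi(L) = 1+\theta L
\psi_0 &= 1, \qquad \psi_1 = (1+\phi_1) + \theta, \\
\psi_j &= (1+\phi_1)\,\psi_{j-1} + (\phi_2-\phi_1)\,\psi_{j-2} - \phi_2\,\psi_{j-3},
\qquad j \ge 2,\ \ \psi_{-1} = 0 \\[4pt]
% h-step-ahead forecast error variance and approximate 95% interval
\operatorname{Var}\!\bigl(e_T(h)\bigr) &= \sigma^2 \sum_{j=0}^{h-1} \psi_j^2, \qquad
\hat y_{T+h} \;\pm\; 1.96\,\sigma \sqrt{\sum_{j=0}^{h-1}\psi_j^2}
\end{aligned}
```

For h = 1 the sum is just \psi_0^2 = 1, so the interval is \hat y_{T+1} ± 1.96 \sigma, which would be consistent with only the first forecast month looking right. For h > 1 each additional \psi_j^2 adds to the variance, so the band cannot stay flat or shrink.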
2) Using simulation with the forecast command:
preserve
drop if month > 681
arima lnVol_PPIs, arima(2,1,1)
restore
estimates store ARIMA211
forecast create PPImodel, replace
forecast estimates ARIMA211, name(DlnVol_PPIs)
forecast identity lnVol_PPIs = DlnVol_PPIs + L.lnVol_PPIs
forecast solve, simulate(betas, statistic(stddev, prefix(sd_))) begin(682)
This delivers:
Obviously, the confidence intervals are now far too narrow, especially at the beginning.
Do you have any suggestions on how to obtain correctly specified forecast intervals after ARIMA modelling in Stata? I have done a lot of research but have not found a solution, so I am thankful for any hint.