Hello everyone,
I am trying to fit an ARIMA model to traffic accident data, but I am having trouble transforming the data so that it satisfies the stationarity assumption.
I have already applied log and square-root transformations (which should stabilize the variance) and taken first, second and even third differences of the raw, log and square-root series. I also took a seasonal difference of the log series (lag = 7), as suggested in Hyndman and Athanasopoulos, chapter 8.1 (online source: https://otexts.com/fpp2/stationarity.html).
The mean of every transformed series is close to zero, which is fine, but none of them passes the white-noise or stationarity tests.
For example, I ran the ADF test on each transformed series and never got a test statistic below any of the critical values.
(Example of the code I used on the first difference of log traffic accidents: dfuller logacc_total_diff1, lags(1))
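For reference, here is a minimal sketch of the same ADF test with a longer lag length, in case lags(1) is too short for daily data with a weekly pattern. It assumes the dataset from this post is in memory; the 14 lags and the maximum of 21 lags are illustrative assumptions only, not recommendations.

```stata
* Sketch only: assumes the posted variables date and logacc_total_diff1 exist
tsset date                                    // declare the daily time variable

* Information criteria for candidate lag lengths (maxlag(21) is an assumption)
varsoc logacc_total_diff1, maxlag(21)

* ADF test with 14 lags instead of 1; regress also shows the test regression
dfuller logacc_total_diff1, lags(14) regress
```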
When I look at the ACF of any of the transformed series, the autocorrelations almost never lie within the 95% confidence bands.
(Example of the code I used on the seasonal difference (lag = 7) of log traffic accidents: ac logacc_total_diff7)
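The ACF check can be put next to the partial ACF and a portmanteau white-noise test, as in the sketch below. The 28 lags (four weeks) are an assumption for illustration. Note that autocorrelations outside the bands indicate that the series is not white noise, which is a stricter condition than stationarity.

```stata
* Sketch only: assumes the posted variables date and logacc_total_diff7 exist
tsset date

ac  logacc_total_diff7, lags(28)        // autocorrelations with confidence bands
pac logacc_total_diff7, lags(28)        // partial autocorrelations
wntestq logacc_total_diff7, lags(28)    // portmanteau (Q) test for white noise
```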
I have run out of ideas for transforming the data to satisfy the stationarity criteria. Maybe someone can suggest a method I have not tried, or correct me if I am doing something wrong.
An alternative I am considering is a simple linear regression with trend and seasonality dummies, as suggested in Hyndman and Athanasopoulos, chapter 5.4 (online resource: https://otexts.com/fpp2/regression-intro.html). If I understand correctly, the data do not have to satisfy stationarity criteria in a linear regression, even when they are time series data.
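A rough sketch of what such a regression could look like with the variables in this dataset is given below. The names dow_num and month_num are hypothetical helper variables created only for this illustration, the choice of change_DST as the indicator is just one of the possible dummies, and lag(7) for the Newey-West standard errors is an assumption, not a tested choice.

```stata
* Sketch only: assumes the posted variables exist; dow_num and month_num are hypothetical
tsset date
generate dow_num   = dow(date)      // 0 = Sunday, ..., 6 = Saturday
generate month_num = month(date)    // 1 = January, ..., 12 = December

* Linear trend (t), weekday and month dummies, one time-change indicator, weather controls
regress acc_total t i.dow_num i.month_num change_DST wind rain sunshine snow temp

* Same specification with Newey-West standard errors (lag(7) is an assumption)
newey acc_total t i.dow_num i.month_num change_DST wind rain sunshine snow temp, lag(7)
```

The second command leaves the coefficients unchanged and only adjusts the standard errors for autocorrelation and heteroskedasticity in the residuals.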
Some brief facts about the data and the research itself:
- research question is whether time change has an effect on traffic accidents (I am not interested in any forecasting)
- Y is represented by total traffic accidents in Germany per day
- time change is indicated by binary variables (there are different types of time change indicators, e.g. change to standard time or change to daylight saving time, one week prior to a change or after a change and so on; these are not meant to be used in one model at the same time, I just created a variety of dummy variables to be able to create different models)
- weather control variables are added
- weekday dummies are included (normal weekday dummies and dummies that indicate one weekday before/after a time change)
- time series starts in August 2002 and ends in December 2019 (6362 observations)
A sample of the dataset is included and all relevant variables should be labelled.
I use Stata 16.0.
Thank you to everyone who finds some time to reply, I really appreciate it.
Have a nice weekend everyone!
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float date int t str9 weekday int acc_total byte(change_DST change_standard week_before) float(week_after sun mon tue wed thur fri sat wind rain sunshine snow temp logacc_total_diff1 logacc_total_diff2 logacc_total_diff7 sqrtacc_total_diff1)
15632  80 "Saturday"  1324 0 0 0 0 0 0 0 0 0 0 1   3.5 1.5923077 3.72 0  4.892308   -.16658926   -.26953745  -10.41526  -3.160629
15633  81 "Sunday"     819 0 0 1 0 1 0 0 0 0 0 0  3.02  .9384615 3.76 0  4.646154    -.4803286    -.3137393   8.914955  -7.768635
15634  82 "Monday"    1655 0 0 1 0 0 1 0 0 0 0 0  3.42  7.676923  .46 0         8     .7034721    1.1838007  -4.628877  12.063515
15635  83 "Tuesday"   1590 0 0 1 0 0 0 1 0 0 0 0  2.74  5.815384 1.08 0 11.807693   -.04006672    -.7435389  -5.409641  -.8068848
15636  84 "Wednesday" 1502 0 0 1 0 0 0 0 1 0 0 0   5.7 4.5076923  .96 0       9.6   -.05693674  -.016870022  17.060062 -1.1191597
15637  85 "Thursday"  1357 0 0 1 0 0 0 0 0 1 0 0  4.25 1.5846153  4.3 0  6.369231   -.10152102   -.04458427  -20.53279 -1.9181633
15638  86 "Friday"    1811 0 0 1 0 0 0 0 0 0 1 0 4.175 10.946154 1.08 0  9.653846    .28860283     .3901238  14.112476   5.718365
15639  87 "Saturday"  1216 0 0 1 0 0 0 0 0 0 0 1  6.94       4.7 2.72 0       9.6    -.3983126    -.6869154  -7.816172  -7.684654
15640  88 "Sunday"     987 0 1 0 1 1 0 0 0 0 0 0  7.84   9.66923  .48 0   9.83077     -.208652    .18966055   8.630507 -3.4546375
15641  89 "Monday"    1341 0 0 0 1 0 1 0 0 0 0 0  7.42         1 4.54 0  6.976923     .3065009    .51515293 -11.409594   5.203112
15642  90 "Tuesday"   1188 0 0 0 1 0 0 1 0 0 0 0   2.5 2.0153847 4.68 0  6.761539    -.1211443    -.4276452   7.757553 -2.1522903
15643  91 "Wednesday" 1387 0 0 0 1 0 0 0 1 0 0 0  1.76  2.592308 1.18 0  7.546154    .15487194    .27601624  1.8444653   2.775074
15644  92 "Thursday"  1404 0 0 0 1 0 0 0 0 1 0 0  1.64  .4538462 1.18 0  6.792308   .012182236    -.1426897  -9.315535  .22753906
15645  93 "Friday"    1282 0 0 0 1 0 0 0 0 0 1 0  1.96  8.384615  .88 0  8.492308   -.09090424   -.10308647   10.03308 -1.6649628
15646  94 "Saturday"  1438 0 0 0 1 0 0 0 0 0 0 1  4.34  7.276923  .86 0  8.530769    .11483192    .20573616   -6.11927   2.115944
15647  95 "Sunday"    1023 0 0 0 0 1 0 0 0 0 0 0  3.92  8.523077  .84 0  6.015385    -.3405137    -.4553456   .7197323    -5.9366
15648  96 "Monday"    1572 0 0 0 0 0 1 0 0 0 0 0  3.46 2.1076922 1.72 0 4.6615386     .4296093      .770123   5.145613   7.664085
15649  97 "Tuesday"   1205 0 0 0 0 0 0 1 0 0 0 0  1.94 .13076924 1.88 0 2.2153847   -.26586914    -.6954784 -11.529654  -4.935349
15650  98 "Wednesday" 1023 0 0 0 0 0 0 0 1 0 0 0  2.92       1.2 5.12 0 2.0076923   -.16374016    .10212898  16.965975  -2.728737
15651  99 "Thursday"  1716 0 0 0 0 0 0 0 0 1 0 0  2.64 4.5692306 1.78 0  2.323077    .51725626     .6809964 -16.968128   9.440258
15652 100 "Friday"    1699 0 0 0 0 0 0 0 0 0 1 0  4.52 12.223077 1.22 0  4.046154  -.009955883   -.52721214    8.34984  -.2056999
end
format %tdMonth_dd,_CCYY date