
  • forecast with Holt-Winters seasonal smoothing using panel data

    Hi all, this is my first post on Statalist; I hope someone here can help me work out this problem. Here it is:

    I want to forecast seasonal sales with the Holt-Winters seasonal smoothing model for over a thousand companies, each with at most 29 quarters of data (some have fewer). I ran:
    Code:
    tsset stkid Gorder, quarterly
    tssmooth shwinters swh1=tinc
    The iteration log keeps showing "backed up" and the estimation won't converge.

    The data for this problem are attached; any suggestions or help would be welcome and deeply appreciated.
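For context, tssmooth shwinters fits a multiplicative Holt-Winters model, estimating the smoothing parameters by minimizing the in-sample residual sum of squares. Below is a minimal Python sketch of the recursions, not Stata's implementation: the initialization is a crude placeholder, and in Stata the parameters alpha, beta, and gamma are estimated rather than supplied.

```python
def holt_winters_mult(y, alpha, beta, gamma, m):
    """One-step-ahead multiplicative Holt-Winters smoothing.

    y: list of positive observations; m: season length (4 for quarterly).
    Returns in-sample one-step-ahead predictions for t = m, ..., len(y)-1.
    """
    # Crude initialization from the first two seasons (illustrative only)
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / m ** 2
    season = [y[i] / level for i in range(m)]

    fitted = []
    for t in range(m, len(y)):
        s = season[t % m]
        fitted.append((level + trend) * s)  # prediction of y[t]
        # Standard multiplicative Holt-Winters updates
        prev_level = level
        level = alpha * y[t] / s + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % m] = gamma * y[t] / level + (1 - gamma) * s
    return fitted
```

One property worth noting for the thread's later discussion of rescaling: in exact arithmetic these recursions are scale-equivariant, so multiplying the series by a constant should only matter numerically.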
    Attached Files

  • #2
    Could anyone help me out with this?



    • #3
      Oh, come on! Can't I get any help, even on Statalist?!



      • #4
        Various reasons why you may not have received an answer:

        1. For whatever reason, not many active members here seem to use these methods. Almost everyone responds just to whatever interests them, presumably also your rule for answering questions on forums.

        2. People don't like being hassled. If later posts don't add substance to earlier posts, they like it even less. No one here is obliged to answer any question whatsoever. (Saying "come on" etc. might just work if people recognised you as someone who had posted answering many questions, but it is not advised when you are new.)

        3. Your name suggests a pseudonym: we do request full real names here. (If your name really is "Goal Jason", I messed up.)

        4. Poor spelling and punctuation really don't help.

        #1 is by far the most important.

        I tried the method on your data and got nowhere. I don't know why. I would suggest working on log scale as many of these series are extraordinarily spikey, but in experiments that didn't seem to help either.

        You need an expert on Holt-Winters, and there's none in sight. Sorry to disappoint.
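A quick illustration of the log-scale suggestion, on toy data invented for the purpose: if a series has multiplicative trend and seasonality, taking logs turns it into an additive structure, which is usually much better behaved for spiky positive series.

```python
import math

# Toy quarterly series with multiplicative trend and seasonality
season = [0.7, 1.1, 1.4, 0.8]
y = [100 * 1.02 ** t * season[t % 4] for t in range(28)]

# On the log scale the model becomes additive: removing the (log-linear)
# trend leaves a seasonal effect that is constant within each quarter.
ly = [math.log(v) for v in y]
effect = [ly[t] - (math.log(100) + t * math.log(1.02)) for t in range(28)]
```

Here each `effect[t]` equals `log(season[t % 4])` up to rounding, i.e. the seasonal pattern repeats exactly with period 4 on the log scale.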



        • #5
          For stkid==1 I noticed the large sums of squares, thought precision might be some sort of issue, and divided tinc by 10,000,000, which then led to convergence. But not for stkid==2. And in retrospect, I'm dismayed that changing scale made a difference at all. I don't know if this speaks to weakness in the methodology, problems in the implementation, or an inappropriate application, but there you are.

          Like Nick, I'm no expert in Holt-Winters, but I have to wonder if applying apparently fairly high-powered methodology to 7 years of quarterly data is within the design range of the methodology.
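One plausible numerical mechanism for the scale-dependence observed here (an assumption about how an RSS-minimizing optimizer behaves in double precision, not a claim about Stata's internals): with observations of order 1e10, the residual sum of squares is of order 1e21, and a double of that magnitude can only change in steps of about a million, which can interact badly with absolute convergence tolerances.

```python
import math

rss = 8.7e21            # plausible RSS magnitude for residuals of order 9e10
rss_small = rss / 1e14  # RSS after dividing the series by 10,000,000

# Smallest representable change (one ulp) in a double near each magnitude:
gap = math.ulp(rss)              # roughly a million
gap_small = math.ulp(rss_small)  # a tiny fraction of one
```

So at the original scale, any convergence check of the form "RSS changed by less than some small absolute tolerance" is operating far below the resolution of the numbers involved.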



          • #6
            Originally posted by Nick Cox View Post
            Thanks Nick, your suggestions are all very helpful. As for the name: I am Chinese, and I have already emailed the administrator to see whether it can be changed to my Chinese name.

            I also want to apologize for my earlier words; this problem has bothered me for quite a while and my deadline is approaching. I will take your kind advice, and that of others, and behave better here.



            • #7
              Originally posted by William Lisowski View Post

              Many thanks, William. I also noticed the large RSS and was confused by it; maybe you are right, and I should try rescaling. What about regularization or normalization (mapping all the observations into the (0, 1) range)?

              I tried dividing tinc by 100,000,000 and setting the maximum number of iterations to 30, which led to convergence for some stkid; all of the series that converged have relatively small RSS. The convergence tolerance needs to be less than 1e-5. So I will try normalizing all the data and then running Holt-Winters again.

              Your question of whether "applying apparently fairly high-powered methodology to 7 years of quarterly data is within the design range of the methodology" is actually what concerns me most.

              Thanks again.



              • #8
                I must say, I have found tssmooth shwinters to be very peculiarly behaved, but your data also have their peculiarities.

                To start with, since this is a multiplicative model, I think you would want to normalize not to (0,1) but rather to something like [1,10].

                As a start at that, I did
                Code:
                // within each stkid transform tinc from [a,b] to [1,b/a]
                bysort stkid: egen double mintinc = min(tinc)
                generate double tincm = tinc/mintinc
                codebook tinc tincm
                which as expected gave me a minimum of 1, but the codebook showed that there was a maximum of 51000, which is a huge range.
                Code:
                . // suspicious looking data: max is > 51000 * min
                . list if tincm > 51000, clean
                
                              tinc   order   stkid   Gorder     mintinc     tincm  
                13552.   9.332e+10   13552     628   1966q2   1826774.3   51082.2  
                
                . list if stkid==628, clean
                
                              tinc   order   stkid   Gorder     mintinc       tincm  
                13528.   1826774.3   13528     628   1960q2   1826774.3           1  
                13529.   1.700e+10   13529     628   1960q3   1826774.3   9306.2983  
                13530.   1.712e+10   13530     628   1960q4   1826774.3   9373.0188  
                13531.   1.813e+10   13531     628   1961q1   1826774.3   9924.2899  
                13532.   1.794e+10   13532     628   1961q2   1826774.3   9821.7109  
                13533.   2.001e+10   13533     628   1961q3   1826774.3   10952.407  
                13534.   1.983e+10   13534     628   1961q4   1826774.3   10854.067  
                13535.   1.963e+10   13535     628   1962q1   1826774.3   10743.521  
                13536.   1.886e+10   13536     628   1962q2   1826774.3   10322.952  
                13537.   1.982e+10   13537     628   1962q3   1826774.3   10848.017  
                13538.   2.062e+10   13538     628   1962q4   1826774.3   11285.121  
                13539.   2.018e+10   13539     628   1963q1   1826774.3   11046.473  
                13540.   1.967e+10   13540     628   1963q2   1826774.3   10768.344  
                13541.   2.435e+10   13541     628   1963q3   1826774.3    13328.75  
                13542.   2.494e+10   13542     628   1963q4   1826774.3   13650.154  
                13543.   2.405e+10   13543     628   1964q1   1826774.3   13165.332  
                13544.   2.555e+10   13544     628   1964q2   1826774.3   13986.428  
                13545.   2.461e+10   13545     628   1964q3   1826774.3   13472.491  
                13546.   2.550e+10   13546     628   1964q4   1826774.3   13960.852  
                13547.   2.488e+10   13547     628   1965q1   1826774.3   13621.706  
                13548.   2.477e+10   13548     628   1965q2   1826774.3   13559.351  
                13549.   2.469e+10   13549     628   1965q3   1826774.3   13514.845  
                13550.   9.243e+09   13550     628   1965q4   1826774.3   5059.7791  
                13551.   1.598e+10   13551     628   1966q1   1826774.3   8746.2144  
                13552.   9.332e+10   13552     628   1966q2   1826774.3     51082.2  
                13553.   3.890e+10   13553     628   1966q3   1826774.3   21292.037  
                
                .
                So some of your problems may well be due to your data. At a minimum, you need to investigate each panel of your data more closely.
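The advice to investigate each panel can be mechanized. A small Python sketch of the idea follows; the data and the cutoff of 1000 are made up for illustration (in Stata the equivalent would be egen max/min ratios by stkid):

```python
# Hypothetical stand-in for the dataset: tinc values keyed by stkid
panels = {
    1: [2.0e9, 3.1e9, 2.7e9, 2.9e9],
    2: [4.5e8, 5.2e8, 4.9e8, 6.0e8],
    628: [1826774.3, 1.7e10, 9.3e10, 3.9e10],  # echoes the listing above
}

def suspect_panels(panels, max_ratio=1000.0):
    """Return panel ids whose within-panel max/min ratio exceeds max_ratio."""
    return sorted(k for k, v in panels.items() if max(v) / min(v) > max_ratio)
```

With these toy numbers only panel 628 is flagged, mirroring the suspicious observation in the listing.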

                Now for the part I can't demonstrate. At one point I ran
                Code:
                tssmooth shwinters swhm1=tincm if stkid==1
                and the result was that it converged nicely! But then I immediately ran
                Code:
                tssmooth shwinters swhm1=tincm
                and on the first panel (stkid==1) it failed to converge, even though it seemed to start with the same sequence of RSS as the previous run!

                Worse yet, I have been unable to reproduce this behavior. It's as if I got one chance at convergence, and then the convergence gods frowned on me and went away.

                I don't know what to say. I'm not fond of the algorithm's apparent dependence on scaling, or of its unpredictable nature. I am reasonably certain I haven't made any errors that led to the non-reproducible result.



                • #9
                  Thanks, William. I compared your result with mine, which used the normalize-to-(0,1) method, and found that almost all the series with relatively large tincm also have bad forecast values. So next I will take your advice, screen each panel of my data to remove the abnormal ones, and then rerun the Holt-Winters algorithm.

                  But something weird happened when I reran your code:
                  Code:
                  // within each stkid transform tinc from [a,b] to [1,b/a]
                  bysort stkid: egen double mintinc = min(tinc)
                  generate double tincm = tinc/mintinc
                  codebook tinc tincm
                  with the same data (I re-downloaded it from this post), and codebook showed me a maximum of tincm of 46881 rather than 51000. Maybe it's just something trivial, but I'm wondering why.

                  And one last question: why should I "normalize not to (0,1) but rather to something like [1,10]"? Is that because this is a multiplicative model, and normalizing to [0,1] effectively forces the left-hand side of the equation to 0 when estimating the parameters?


                  One other weird thing you may be interested in: I ran the code you wrote above and set a maximum number of iterations (40) to let the process continue in case of non-convergence. You know what? It converged for some stkid but not all.

                  Last edited by 高佳; 20 Jan 2016, 20:34.



                  • #10
                    I got a fresh download of your data from the link in post #1 and immediately reran my code; I again got 51082.2 as the maximum, and the next-smaller value was 21,292, so I don't know how 46881 could arise.

                    I recommend normalizing data going into a multiplicative model to start at 1 because that can be done with a multiplicative change (change of scale) which should have no substantive effect on the fit of a multiplicative model. (Remember, we're normalizing in hopes of reducing problems possibly caused by the very large scale.) Like changing meters to kilometers for example. But to normalize the data to start at 0 (if that's not the minimum to begin with) requires an additive change (change of location) which will affect the fit of a multiplicative model. Like changing distance south of the north pole to distance south of the Arctic Circle.
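The meters-to-kilometers point can be checked directly. The sketch below (a toy quarterly series, invented for illustration) estimates seasonal factors by the classical ratio-to-centered-moving-average device: a change of scale leaves the ratios untouched, while a change of location distorts them.

```python
# Toy quarterly series with multiplicative trend and seasonality
season = [0.7, 1.1, 1.4, 0.8]
y = [100 * 1.02 ** t * season[t % 4] for t in range(28)]

def seasonal_ratios(y, m=4):
    """Ratio of each observation to a centered 2x4 moving average."""
    h = m // 2
    out = []
    for t in range(h, len(y) - h):
        ma = (y[t - h] / 2 + sum(y[t - h + 1:t + h]) + y[t + h] / 2) / m
        out.append(y[t] / ma)
    return out

r_base = seasonal_ratios(y)
r_scaled = seasonal_ratios([10 * v for v in y])    # change of scale
r_shifted = seasonal_ratios([v + 500 for v in y])  # change of location
```

The scaled ratios match the originals (up to rounding), while the shifted ones are pulled toward 1, which is why normalizing a multiplicative model's input toward 0 is the wrong move.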

                    I am not surprised that some would converge and some not. Disappointed, but not surprised. I think what is required is a much better understanding of Holt-Winters methodology than the Stata documentation can give, or than can be found (apparently) on Statalist, at least among those who chose to read and respond to your post. Perhaps the paper linked here can give some further insight.

