Dear Statalisters,
I am running into an issue with rolling windows and I need your kind help with this. I am trying to calculate the exponentially weighted moving average of the squared residuals (X2) ,from a regression that I ran, in a 22 overlapping-rolling window. Here are some of the variables from my data but we only care about X2 for now:
By overlapping-rolling window I mean the first window starts at day 1 and ends at day 22. The second window starts at day 2 and ends at day 23, the third window from 3 to 24 and so on. So I already have X2 and I need to weight X2s in an exponentially declining function such that recent X2s are getting greater weights. To illustrate, we start with the most recent date which is day 22 of each window and calculate the weights as follows:
Day 22 Weight (W22)= (1-ʎ)ʎ0 where ʎ = 0.95, and thus W22 = (1-0.95) 0.950 = 0.05.
W21= (1-ʎ)ʎ1 = (1-0.95)*0.951 = 0.0475 or simply W21 = W22 * ʎ = 0.05*0.95 = 0.0475
W20= (1-ʎ)ʎ2 = (1-0.95)*0.952 = 0.0451 or W20= W21*0.95= 0.0451
W19= (1-ʎ)ʎ3 = (1-0.95)*0.953= 0.0429 or W19= W20*0.95= 0.0429 and so on until W1 .
Now, we have a corresponding weight for each X2. Next, we generate a new variable, call it WX2, which is nothing but the product of each X2 and its own weight. The final step is to sum up variable WX2 for all 22 days, and we call this variable VOL. The square root of VOL is basically my exponentially weighted moving average, but calculating this should not be an issue.
We repeat the process for the next 22 days, from day 2 to day 23, then the next 22 days from 3 to 24, and so on for each firm in my data since my data is panel data. Now at the end of each firm, we might now have 22 days but that it ok, we will go with whatever days left.
It is important to note that for each firm, we will not have any observation for the first 21 days for variable VOL, as we lost them in calculating the first VOL (the first summation is from 1 to 22 so the first observation takes place in day 22).
For simplicity, here is an example of a 10 days overlapping-rolling window:

So the goal here is to get a time series of VOL from X2, X2 in the above example. My first window is the blue window from day 1 to day 10. I first calculate the weights for the 10 days window stating from the most recent day which is day 10 since my dates are sorted in a ascending order. Given that Lambda ʎ = 0.95, the weight for day 10 is as we mentioned esrlier
W10 = (1-0.95) 0.950 = 0.05.
W9 = (1-0.95)*0.951 = 0.0475
.
.
W1 = (1-0.95)*0.959 = 0.031512
Then we multiply each weight by its X2 under a new variable X2*W1. Summing this colume from X2*W1 at day 1 to X2*W1 at day 10 gives us our first VOL observation at day 10. As we can see, in order to get VOL10 we lost the first 10 observations. We do the same process to get VOL11 by taking a window from day 2 to day 11 ( brown window) and VOL12 from day 3 to day 12 ( green window). We keep doing the same process until the last possible observation for each firm. This rolling window must be done using a firm id to ensure that there is no overlapping between firms. Once I have a time series of VOL, I then take the square root it and gives me the exponentially weighted moving average of X2.
I have used Mr. Cox's code for a non-overlapping rolling regression
I have also used Mr. Shah's code in calculating non-overlapping means
but I am not sure if there is a way to make any of these codes overlapping and second how to work these codes to have it create an overlapping rolling window weights, do the multiplication of X2 and the weights (W), and finally sums up the window. Long a ago, I remember reading a suggestion by Mr. Schechter regarding modifying he rangestata command to allow for an overlapping rolling window by taking lag of the variable but I could not find the post. I have looking for it two days now but with no luck.
This is very important to me and any help is highly appreciated
Thank you
I am running into an issue with rolling windows and I need your kind help with this. I am trying to calculate the exponentially weighted moving average of the squared residuals (X2) ,from a regression that I ran, in a 22 overlapping-rolling window. Here are some of the variables from my data but we only care about X2 for now:
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input int Timestamp long Company float(id residual X2) 18535 1 1 -.3732804 .13933825 18536 1 1 -.4201257 .17650563 18539 1 1 -.4297678 .18470033 18540 1 1 -.4525483 .20479997 18541 1 1 -.50208974 .25209412 18542 1 1 -.5432706 .29514292 18543 1 1 -.5648846 .3190946 18546 1 1 -.5315696 .28256625 18547 1 1 -.5799347 .3363243 18548 1 1 -.58587825 .3432533 18549 1 1 -.6337441 .4016315 18550 1 1 -.6192474 .3834674 18553 1 1 -.6366866 .40536985 18554 1 1 -.6670792 .4449947 18555 1 1 -.7865576 .6186729 end format %tdnn/dd/CCYY Timestamp label values Company Company label def Company 1 "AAK.ST", modify
By overlapping-rolling window I mean the first window starts at day 1 and ends at day 22. The second window starts at day 2 and ends at day 23, the third window from 3 to 24 and so on. So I already have X2 and I need to weight X2s in an exponentially declining function such that recent X2s are getting greater weights. To illustrate, we start with the most recent date which is day 22 of each window and calculate the weights as follows:
Day 22 Weight (W22)= (1-ʎ)ʎ0 where ʎ = 0.95, and thus W22 = (1-0.95) 0.950 = 0.05.
W21= (1-ʎ)ʎ1 = (1-0.95)*0.951 = 0.0475 or simply W21 = W22 * ʎ = 0.05*0.95 = 0.0475
W20= (1-ʎ)ʎ2 = (1-0.95)*0.952 = 0.0451 or W20= W21*0.95= 0.0451
W19= (1-ʎ)ʎ3 = (1-0.95)*0.953= 0.0429 or W19= W20*0.95= 0.0429 and so on until W1 .
Now, we have a corresponding weight for each X2. Next, we generate a new variable, call it WX2, which is nothing but the product of each X2 and its own weight. The final step is to sum up variable WX2 for all 22 days, and we call this variable VOL. The square root of VOL is basically my exponentially weighted moving average, but calculating this should not be an issue.
We repeat the process for the next 22 days, from day 2 to day 23, then the next 22 days from 3 to 24, and so on for each firm in my data since my data is panel data. Now at the end of each firm, we might now have 22 days but that it ok, we will go with whatever days left.
It is important to note that for each firm, we will not have any observation for the first 21 days for variable VOL, as we lost them in calculating the first VOL (the first summation is from 1 to 22 so the first observation takes place in day 22).
For simplicity, here is an example of a 10 days overlapping-rolling window:
So the goal here is to get a time series of VOL from X2, X2 in the above example. My first window is the blue window from day 1 to day 10. I first calculate the weights for the 10 days window stating from the most recent day which is day 10 since my dates are sorted in a ascending order. Given that Lambda ʎ = 0.95, the weight for day 10 is as we mentioned esrlier
W10 = (1-0.95) 0.950 = 0.05.
W9 = (1-0.95)*0.951 = 0.0475
.
.
W1 = (1-0.95)*0.959 = 0.031512
Then we multiply each weight by its X2 under a new variable X2*W1. Summing this colume from X2*W1 at day 1 to X2*W1 at day 10 gives us our first VOL observation at day 10. As we can see, in order to get VOL10 we lost the first 10 observations. We do the same process to get VOL11 by taking a window from day 2 to day 11 ( brown window) and VOL12 from day 3 to day 12 ( green window). We keep doing the same process until the last possible observation for each firm. This rolling window must be done using a firm id to ensure that there is no overlapping between firms. Once I have a time series of VOL, I then take the square root it and gives me the exponentially weighted moving average of X2.
I have used Mr. Cox's code for a non-overlapping rolling regression
Code:
rangestat (reg)
Code:
bys id: asrol res_sqr, window( Timestamp 22) stat(mean) gen(mean)
This is very important to me and any help is highly appreciated
Thank you
Comment