Collect rolling correlations using statsby

Christopher Koedding

Join Date: Dec 2016

Posts: 14
#1

Collect rolling correlations using statsby

15 Dec 2016, 08:21

Dear all,

I have an unbalanced panel dataset of stock and market returns where permno is the panel variable and date is the time variable. The data has been tsset. The sample of interest covers the period from January 1996 through October 2012.

Now I'm trying to estimate 60-months rolling correlations between the stock and market returns. Rolling over 60 months is done by looping. I would like to use the -statsby- command to collect correlation coefficients for the different stocks (permno).

So far, my code looks like this (later I will append all files):

Code:

* 3. Correlations // Daily sample: months from 01Jan1996-31Oct2012 local k = 1996*12+1 // k=Jan1996 local l = 2012*12+10 // l=Oct2012 forvalues i=`k'(1)`l'{ disp `i' quietly use date permno stock_ret market_ret if year(date)*12+month(date)<=`i'&year(date)*12+month(date)>`i'-60 using "C:\...", clear if (r(N)>=200){ quietly statsby rho_s_m = r(rho), by(permno) saving(C:/.../corr`i', replace): corr stock_ret market_ret } }

However, tow issues arised when running the code.
1. It takes a lot of time. After 30 minutes, the very first file "corr23953" hasn't been finished, yet.

2. When choosing the "noisily" option, every now and then there is the message

"no observations
captured error running (correlate stock_ret market_ret), posting missing values"

popping up.

My questions are: What did I do wrong? Is there a more efficient way to collect rolling correlations in this case?

Help is highly appreciated.

Best,
Christopher
Tags: None
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#2

15 Dec 2016, 09:48

First, if you want rolling statistics, instead of building a loop around -statsby- I would use the -rolling- command.

Next, you have made your life more complicated than it needs be with your way of handling dates. Use Stata monthly dates instead.

Next, your -if (r(N) >= 200) {- is incorrect. In fact, I'm amazed you got any output from it at all. It should generate a syntax error because r(N) is not defined at that point in the code.

So, here's how I would approach this:

Code:

use my_data_file, clear gen int mdate = mofd(date) format mdate %tm keep if inrange(mdate, tm(1996m1), tm(2012m10) xtset permno mdate rolling r(rho), window(60) reject(r(N) < 200)) saving(correlations, replace): corr stock_ret market_ret

Notes: Not tested. Beware of typos, other minor errors. This will produce a single file containing all of your correlations, and there will be variables indicating which permno and which window they apply to.

Note also that if your data set is large this is going to take a long time. So be patient.
Comment
Christopher Koedding

Join Date: Dec 2016

Posts: 14
#3

15 Dec 2016, 10:21

Thank you very much indeed, Clyde!

However, the problem is that when I -xtset- the data, Stata shows the error message:
"repeated time values within panel"

Note that the original dataset is daily, so I have observations of "stock_ret" and "market_ret" for every day. My goal, however, is to obtain monthly rolling correlations.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35724
#4

15 Dec 2016, 10:28

OK, but what does that mean? Correlations between monthly means? Mean correlations for each month of daily values? etc.
Comment
Christopher Koedding

Join Date: Dec 2016

Posts: 14
#5

15 Dec 2016, 10:37

Sorry, should have stated this more clearly from the beginning. Indeed, I'm looking for mean correlations for each month of daily values.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#6

15 Dec 2016, 10:52

This seems to get less clear as we proceed. How do you reconcile "mean correlations of each month of daily values" with a 60 month window? Do you mean you want to do a correlation for each month, and then average those monthly correlations over 60 month windows? I worry about that approach if only because, in general, averaging correlation coefficients is rarely a useful or meaningful thing to do.

Or perhaps the monthly aspect of the data is more about framing? That is, do you want a 60 month window that correlates the daily observations, and perhaps with the windows separated by a month instead of by a day?

Nick's suggestion in #4 of reducing the data set to monthly means and then correlating the monthly means over a 60 month window makes sense to me. But I gather that isn't what you have in mind based on what you say in #5.

Last edited by Clyde Schechter; 15 Dec 2016, 10:54.
Comment
Christopher Koedding

Join Date: Dec 2016

Posts: 14
#7

15 Dec 2016, 13:42

Sorry for the confusion. Of course the window length has to be specified in days, not months. What confused me here was that in financial markets a month typically consists of 20 or 21 trading days, but that can obviously vary due to weekends or holidays. My first guess to “circumvent” this was specifying the window length in months … Never mind.
Let’s assume, on average, 250 trading days per year, i.e. 1250 trading days per five years (60 months). I want to require a minimum of 200 valid observations. If the number of observations is less than that, the correlation value should be set missing. Having -tsset- the (daily) data, I would have something like:

Code:

rolling r(rho), window(1250) reject(r(N) < 200)) saving(correlations, replace): corr stock_ret market_ret

More specifically, the obtained correlations then serve as an input for a “market beta” variable that is averaged on a monthly basis.
The code seems to work perfectly well. However, it is very slow as you mentioned in #2.

Is there a way to combine the -rangestat- command with Mata to speed things up? Something like:

Code:

mata: mata clear mata set matastrict on real rowvector mycorr(real matrix X) { real matrix R R = correlation(X) return(rows(X), R[2,1]) } end rangestat (mycorr) stock_ret market_ret, interval(date -1250 0) by(permno) casewise rename (myreg1 myreg2) (nobs rho_s_m) // Reject results if window < 1250 or number of obs < 200 isid permno date by permno: replace rho_s_m = . if _n < 1250 | nobs < 200
Comment
Christopher Koedding

Join Date: Dec 2016

Posts: 14
#8

15 Dec 2016, 15:11

Edit -rangestat- command:

Code:

... rename (mycorr1 mycorr2) (nobs rho_s_m) ...
Comment

Announcement

Collect rolling correlations using statsby

Comment

Comment

Comment

Comment

Comment

Comment

Comment