Rangestat usage

Vats Prith

Join Date: Jul 2017
Posts: 30

09 Dec 2017, 08:21

Hi,
I have a question on how (and if) we could use rangestat for the following specific case. I want that: for the current observation (indexed by id & yearmonth) a value is calculated using previous 60 months returns such that: value = sumproduct of two vectors, vector a is the previous 60 returns (variable ret below)) and vector b is the (vectors of numbers such as 1/60,4/60...3/60...10/60....having 60 such numbers.). I guess one would have to write the mata subroutine for the sumproduct first and then call rangestat. I am just not sure how to proceed. Any hlep, as usual, is greatly appreciated.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double permno float fmy double ret
10000 321  -.24242424964904785
10000 322   .05999999865889549
10000 323  -.37735849618911743
10000 324  -.21212121844291687
10000 325                    0
10000 326  -.38461539149284363
10000 327               -.0625
10000 328  -.06666667014360428
10000 329                    .
10001 312                    .
10001 313  .020408162847161293
10001 314  .025200003758072853
10001 315  .009900989942252636
10001 316 -.009803921915590763
10001 317 -.013069307431578636
10001 318 -.010204081423580647
10001 319   .07216494530439377
10001 320 -.003076923545449972
10001 321   .03921568766236305
10001 322  .056603774428367615
10001 323  .014999999664723873
10001 324   -.0357142873108387
10001 325  -.07407407462596893
10001 326   .03680000081658363
10001 327  -.03921568766236305
10001 328   -.0714285746216774
10001 329  .051428571343421936
10001 330  .021276595070958138
10001 331    .0833333358168602
10001 332  -.02230769209563732

end
format %tm fmy

Tags: None

Robert Picard

Join Date: Mar 2014

Posts: 1536
#2

09 Dec 2017, 09:31

Thanks for the data example but I don't understand what you want here. Reduce the window to say 5 periods, pick any fmy observation you want and show exactly how you would calculate the desired result for that observation. Note that if you want to calculate the product of returns over a window of time, this is typically done as a sum of logs.
Comment

Vats Prith

Join Date: Jul 2017
Posts: 30

09 Dec 2017, 10:26

Thanks Robert for the quick reply. I'll follow your suggestion reduce window size to 5 and try to be more clear.

Let us say, we have the following data

permno	fmy	ret
10001	1994m1	-0.0790852
10001	1994m2	0.0242261
10001	1994m3	0.0415091
10001	1994m4	-0.1544052
10001	1994m5	0.0584364
10001	1994m7	0.0418305
10001	1994m8	-0.0696244
10001	1994m9	0.0566967
10001	1994m10	-0.068574

Let us say we are talking about the following observation:

10001

1994m8

-0.0696244

Let us say the window size is 6.

Now, I would like to do the following:

I want these 6 previous returns to be divided into 2 groups (positive and negative). Thus, here we have 2 negative returns and 4 positive returns. Then I need to sort within these groups and give ranks.

I give the positive returns ranks( 4,3,2,1) 4 being the most positive and 1 being the least positive. Similarly for negative returns we will have two ranks. (Generally, for a window size of w if there are m negative returns then there are n = w - m positive returns)

The window with ranks now looks like:

10001	1994m1	-0.0790852	-1
10001	1994m2	0.0242261	1
10001	1994m3	0.0415091	2
10001	1994m4	-0.1544052	-2
10001	1994m5	0.0584364	4
10001	1994m7	0.0418305	3

I want that for each current observation, the rank-weighted sum ( rank X return) is calculated on rolling basis: In this case It would be 0.854377 ( = -0.079 * -1 + 0.024*1 + ..0.041*3)

10001

1994m8

-0.0696244

0.854377

Comment

Robert Picard

Join Date: Mar 2014
Posts: 1536

09 Dec 2017, 14:28

I would use rangerun (from SSC) to do this. The my_wsum program (defined below) sorts the observations in the window on the value of ret. Missing values are ranked highest but the product of ret with the weight will be missing for these and therefore will not affect the sum. The last part shows how to spot check the result for the last observation. I added an optimization where the upper bound is invalid (the value is higher than the interval's lower bound) for the first 5 observations per panel. This allows you to skip running the my_wsum program for observations where there would be an insufficient number of observations in the desired window.

Code:

clear all 

* Example generated by -dataex-. To install: ssc install dataex
clear
input double permno float fmy double ret
10000 321  -.24242424964904785
10000 322   .05999999865889549
10000 323  -.37735849618911743
10000 324  -.21212121844291687
10000 325                    0
10000 326  -.38461539149284363
10000 327               -.0625
10000 328  -.06666667014360428
10000 329                    .
10001 312                    .
10001 313  .020408162847161293
10001 314  .025200003758072853
10001 315  .009900989942252636
10001 316 -.009803921915590763
10001 317 -.013069307431578636
10001 318 -.010204081423580647
10001 319   .07216494530439377
10001 320 -.003076923545449972
10001 321   .03921568766236305
10001 322  .056603774428367615
10001 323  .014999999664723873
10001 324   -.0357142873108387
10001 325  -.07407407462596893
10001 326   .03680000081658363
10001 327  -.03921568766236305
10001 328   -.0714285746216774
10001 329  .051428571343421936
10001 330  .021276595070958138
10001 331    .0833333358168602
10001 332  -.02230769209563732
end
format %tm fmy

program my_wsum
  sort ret
  count if ret < 0
  gen w = cond(ret < 0, _n - 1 - r(N), _n - r(N))
  gen double wx = sum(ret * w)
  drop w
end

bysort permno (fmy): gen high = cond(_n > 5, fmy - 1, -999)
rangerun my_wsum, interval(fmy -6 high) by(permno)


* spot check for last obs
preserve
keep if inrange(fmy, fmy[_N]-6,fmy[_N]-1) & permno == permno[_N]
sort ret
count if ret < 0
gen w = cond(ret < 0, _n - 1 - r(N), _n - r(N))
gen double x = w * ret
list
sum x, meanonly
dis r(sum)
restore
list in l

and the results from the spot check:

Code:

. list

     +------------------------------------------------------------------+
     | permno      fmy          ret   high          wx    w           x |
     |------------------------------------------------------------------|
  1. |  10001   1987m5   -.07142857    327   .59477921   -2   .14285715 |
  2. |  10001   1987m4   -.03921569    326    .6165246   -1   .03921569 |
  3. |  10001   1987m7     .0212766    329   .76438483    1    .0212766 |
  4. |  10001   1987m3        .0368    325   .55997042    2       .0736 |
  5. |  10001   1987m6    .05142857    328   .71332769    3   .15428571 |
     |------------------------------------------------------------------|
  6. |  10001   1987m8    .08333334    330   .65345737    4   .33333334 |
     +------------------------------------------------------------------+

. sum x, meanonly

. dis r(sum)
.76456849

. restore

. list in l

     +-------------------------------------------------+
     | permno      fmy          ret   high          wx |
     |-------------------------------------------------|
 30. |  10001   1987m9   -.02230769    331   .76456849 |
     +-------------------------------------------------+

Comment

Vats Prith

Join Date: Jul 2017

Posts: 30
#5

09 Dec 2017, 15:52

Thanks Robert, I have incorporated the code and running it for 3 million plus observations on a window size of 60. Wonder how long it would take. But thank you very much and I will update you of the results and performance.
1 like
Comment

Robert Picard

Join Date: Mar 2014
Posts: 1536

10 Dec 2017, 09:42

I don't know the exact structure of your dataset but the following example with 3 million observations runs in less than 3 minutes on my computer:

Code:

clear all 
set obs 10000
gen long permno = _n
expand 300
bysort permno: gen fmy = ym(1990,1) + _n
gen double ret = runiform(-1,1)
format %tm fmy

program my_wsum
  sort ret
  count if ret < 0
  gen w = cond(ret < 0, _n - 1 - r(N), _n - r(N))
  gen double wx = sum(ret * w)
  drop w
end

rangerun my_wsum, interval(fmy -60 -1) by(permno)

and the timing report:

Code:

. rangerun my_wsum, interval(fmy -60 -1) by(permno)
  (using rangestat version 1.1.1)
r; t=164.94 11:37:34

Comment

Vats Prith

Join Date: Jul 2017

Posts: 30
#7

10 Dec 2017, 22:40

I figured, I could add any amount of stata code under the program and so I did add some more code. But again, a matter of minutes literally.
Rangestat and Rangerun are absolutely wonderful. Thanks for this. It is a pity that a vast majority of us finance folks are unaware of how easy and wonderful there lives will become if they chuck their old ways dealing with well - anything rolling. The benefits are too great to be ignored.
1 like
Comment
Robert Picard

Join Date: Mar 2014

Posts: 1536
#8

11 Dec 2017, 12:35

Great that this worked for you and thanks for the feedback. By all means, spread the word around to the finance community.
Comment

Announcement

Rangestat usage

Comment

Comment

Comment

Comment

Comment

Comment

Comment