Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Rangestat usage

    Hi,
    I have a question on how (and if) we could use rangestat for the following specific case. I want that: for the current observation (indexed by id & yearmonth) a value is calculated using previous 60 months returns such that: value = sumproduct of two vectors, vector a is the previous 60 returns (variable ret below)) and vector b is the (vectors of numbers such as 1/60,4/60...3/60...10/60....having 60 such numbers.). I guess one would have to write the mata subroutine for the sumproduct first and then call rangestat. I am just not sure how to proceed. Any hlep, as usual, is greatly appreciated.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double permno float fmy double ret
    10000 321  -.24242424964904785
    10000 322   .05999999865889549
    10000 323  -.37735849618911743
    10000 324  -.21212121844291687
    10000 325                    0
    10000 326  -.38461539149284363
    10000 327               -.0625
    10000 328  -.06666667014360428
    10000 329                    .
    10001 312                    .
    10001 313  .020408162847161293
    10001 314  .025200003758072853
    10001 315  .009900989942252636
    10001 316 -.009803921915590763
    10001 317 -.013069307431578636
    10001 318 -.010204081423580647
    10001 319   .07216494530439377
    10001 320 -.003076923545449972
    10001 321   .03921568766236305
    10001 322  .056603774428367615
    10001 323  .014999999664723873
    10001 324   -.0357142873108387
    10001 325  -.07407407462596893
    10001 326   .03680000081658363
    10001 327  -.03921568766236305
    10001 328   -.0714285746216774
    10001 329  .051428571343421936
    10001 330  .021276595070958138
    10001 331    .0833333358168602
    10001 332  -.02230769209563732
    
    end
    format %tm fmy

  • #2
    Thanks for the data example but I don't understand what you want here. Reduce the window to say 5 periods, pick any fmy observation you want and show exactly how you would calculate the desired result for that observation. Note that if you want to calculate the product of returns over a window of time, this is typically done as a sum of logs.

    Comment


    • #3
      Thanks Robert for the quick reply. I'll follow your suggestion reduce window size to 5 and try to be more clear.

      Let us say, we have the following data
      permno fmy ret
      10001 1994m1 -0.0790852
      10001 1994m2 0.0242261
      10001 1994m3 0.0415091
      10001 1994m4 -0.1544052
      10001 1994m5 0.0584364
      10001 1994m7 0.0418305
      10001 1994m8 -0.0696244
      10001 1994m9 0.0566967
      10001 1994m10 -0.068574
      Let us say we are talking about the following observation:
      10001 1994m8 -0.0696244

      Let us say the window size is 6.

      Now, I would like to do the following:

      I want these 6 previous returns to be divided into 2 groups (positive and negative). Thus, here we have 2 negative returns and 4 positive returns. Then I need to sort within these groups and give ranks.

      I give the positive returns ranks( 4,3,2,1) 4 being the most positive and 1 being the least positive. Similarly for negative returns we will have two ranks. (Generally, for a window size of w if there are m negative returns then there are n = w - m positive returns)

      The window with ranks now looks like:
      10001 1994m1 -0.0790852 -1
      10001 1994m2 0.0242261 1
      10001 1994m3 0.0415091 2
      10001 1994m4 -0.1544052 -2
      10001 1994m5 0.0584364 4
      10001 1994m7 0.0418305 3
      I want that for each current observation, the rank-weighted sum ( rank X return) is calculated on rolling basis: In this case It would be 0.854377 ( = -0.079 * -1 + 0.024*1 + ..0.041*3)
      10001 1994m8 -0.0696244 0.854377

      Comment


      • #4
        I would use rangerun (from SSC) to do this. The my_wsum program (defined below) sorts the observations in the window on the value of ret. Missing values are ranked highest but the product of ret with the weight will be missing for these and therefore will not affect the sum. The last part shows how to spot check the result for the last observation. I added an optimization where the upper bound is invalid (the value is higher than the interval's lower bound) for the first 5 observations per panel. This allows you to skip running the my_wsum program for observations where there would be an insufficient number of observations in the desired window.

        Code:
        clear all 
        
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input double permno float fmy double ret
        10000 321  -.24242424964904785
        10000 322   .05999999865889549
        10000 323  -.37735849618911743
        10000 324  -.21212121844291687
        10000 325                    0
        10000 326  -.38461539149284363
        10000 327               -.0625
        10000 328  -.06666667014360428
        10000 329                    .
        10001 312                    .
        10001 313  .020408162847161293
        10001 314  .025200003758072853
        10001 315  .009900989942252636
        10001 316 -.009803921915590763
        10001 317 -.013069307431578636
        10001 318 -.010204081423580647
        10001 319   .07216494530439377
        10001 320 -.003076923545449972
        10001 321   .03921568766236305
        10001 322  .056603774428367615
        10001 323  .014999999664723873
        10001 324   -.0357142873108387
        10001 325  -.07407407462596893
        10001 326   .03680000081658363
        10001 327  -.03921568766236305
        10001 328   -.0714285746216774
        10001 329  .051428571343421936
        10001 330  .021276595070958138
        10001 331    .0833333358168602
        10001 332  -.02230769209563732
        end
        format %tm fmy
        
        program my_wsum
          sort ret
          count if ret < 0
          gen w = cond(ret < 0, _n - 1 - r(N), _n - r(N))
          gen double wx = sum(ret * w)
          drop w
        end
        
        bysort permno (fmy): gen high = cond(_n > 5, fmy - 1, -999)
        rangerun my_wsum, interval(fmy -6 high) by(permno)
        
        
        * spot check for last obs
        preserve
        keep if inrange(fmy, fmy[_N]-6,fmy[_N]-1) & permno == permno[_N]
        sort ret
        count if ret < 0
        gen w = cond(ret < 0, _n - 1 - r(N), _n - r(N))
        gen double x = w * ret
        list
        sum x, meanonly
        dis r(sum)
        restore
        list in l
        and the results from the spot check:
        Code:
        . list
        
             +------------------------------------------------------------------+
             | permno      fmy          ret   high          wx    w           x |
             |------------------------------------------------------------------|
          1. |  10001   1987m5   -.07142857    327   .59477921   -2   .14285715 |
          2. |  10001   1987m4   -.03921569    326    .6165246   -1   .03921569 |
          3. |  10001   1987m7     .0212766    329   .76438483    1    .0212766 |
          4. |  10001   1987m3        .0368    325   .55997042    2       .0736 |
          5. |  10001   1987m6    .05142857    328   .71332769    3   .15428571 |
             |------------------------------------------------------------------|
          6. |  10001   1987m8    .08333334    330   .65345737    4   .33333334 |
             +------------------------------------------------------------------+
        
        . sum x, meanonly
        
        . dis r(sum)
        .76456849
        
        . restore
        
        . list in l
        
             +-------------------------------------------------+
             | permno      fmy          ret   high          wx |
             |-------------------------------------------------|
         30. |  10001   1987m9   -.02230769    331   .76456849 |
             +-------------------------------------------------+

        Comment


        • #5
          Thanks Robert, I have incorporated the code and running it for 3 million plus observations on a window size of 60. Wonder how long it would take. But thank you very much and I will update you of the results and performance.

          Comment


          • #6
            I don't know the exact structure of your dataset but the following example with 3 million observations runs in less than 3 minutes on my computer:
            Code:
            clear all 
            set obs 10000
            gen long permno = _n
            expand 300
            bysort permno: gen fmy = ym(1990,1) + _n
            gen double ret = runiform(-1,1)
            format %tm fmy
            
            program my_wsum
              sort ret
              count if ret < 0
              gen w = cond(ret < 0, _n - 1 - r(N), _n - r(N))
              gen double wx = sum(ret * w)
              drop w
            end
            
            rangerun my_wsum, interval(fmy -60 -1) by(permno)
            and the timing report:
            Code:
            . rangerun my_wsum, interval(fmy -60 -1) by(permno)
              (using rangestat version 1.1.1)
            r; t=164.94 11:37:34

            Comment


            • #7
              I figured, I could add any amount of stata code under the program and so I did add some more code. But again, a matter of minutes literally.
              Rangestat and Rangerun are absolutely wonderful. Thanks for this. It is a pity that a vast majority of us finance folks are unaware of how easy and wonderful there lives will become if they chuck their old ways dealing with well - anything rolling. The benefits are too great to be ignored.

              Comment


              • #8
                Great that this worked for you and thanks for the feedback. By all means, spread the word around to the finance community.

                Comment

                Working...
                X