Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Rolling Window / Serial Covariance- Roll's (1984) bid-ask spread estimator (Market Liquidity)

    - this is the a general description of a measure; if this does not comply with Statalist rules, feel free to delete it. My question is in the second post of this thread. The question can be understood without reading the first post, hopefully! -

    I am going to go quickly highlight the main concept of the Roll bid-ask spread estimator, as this thread might be useful for other finance students. This estimator of the bid-ask spread shall reflect the liquidity of a security, given that some assumptions hold (see: Roll, The Journal of Finance, 1984). It can be used for low frequency data, basing on last prices, or on high frequency, basing on intraday data.

    This liquidity measure can be estimated as follows:

    spread estimate = 2*sqrt(- covariance (price_change_t+1, price_change_t)) // in the case the spread should be expressed as percent, replace 2* with 200*

    The Roll measure can be calculated for arbitrary intervals. For daily estimates, a full trading month of data, i.e. 21 trading days, are recommended and the daily estimator should be calculated on days with at least one transaction. To be well defined, four transactions should take place within that 21-day period (more relevant for corporate bonds, than for stocks). The serial covariance is then calculated based on the price changes in that 21-day period. For daily estimates, a rolling window of 21 days is useful. For quarterly estimates, the median of the daily estimates may be used.
    Last edited by Felix Schrock; 29 Aug 2018, 11:02.

  • #2
    I have problems to calculate the serial correlation of the price changes on a rolling window. I would like to calculate the spread estimator with a daily frequency and a monthly frequency. I will try to explain the rolling window "problem".

    I have a panel data set, with different bonds (isin). For each bond, I have a time series of price changes (px_change). Of the full formula for the estimator,

    spread estimate = 2*sqrt(- covariance (price_change_t+1, price_change_t))

    I would like to calculate covariance part: (covariance (price_change_t+1; price_change_t) for a rolling window of 21 observations. I have tried this, which does not work:

    Code:
    rangestat (corr) L.px_change px_change if include, by(isin) interval(obs_count 0 20)
    Maybe something in similiar with (corr) px_change_N px_change_n could be used?

    By rolling window I mean:

    - the first window starts at observation 1 and ends at observation 21
    - the second window starts at observation 2 and ends at observation 22
    - ...
    - the last window starts at observation N-21 and ends at observation N

    I assume, that the first covariance result should appear in the line of observation 21.

    Code:
    input float monthly int date double px_change long isin float obs_count
    633 19297    .005740245403982524 1   1
    634 19298    .001604174230207426 1   2
    634 19299   .0038185634446308548 1   3
    634 19302   .0025192301232743034 1   4
    634 19303  -.0029568451383769553 1   5
    634 19304    .010719896497551026 1   6
    634 19305  -.0003906672097217705 1   7
    634 19306    .003118243805088974 1   8
    634 19309   -.001914867161271677 1   9
    634 19310   .0021012590943822475 1  10
    634 19311  -.0016907431810835138 1  11
    634 19312  -.0032958913435116232 1  12
    634 19313 -.00019990671020191336 1  13
    634 19316   -.002707611303652383 1  14
    634 19317    -.00823677813327533 1  15
    634 19318   -.003242897212793062 1  16
    634 19319   -.001867563547863741 1  17
    634 19320    -.00328493417432156 1  18
    634 19323    .005827040525962547 1  19
    634 19324  -.0006249313842229998 1  20
    634 19325    .008771410946518026 1  21
    634 19326  -.0006115080794457333 1  22
    634 19327  -.0030258834573862084 1  23
    635 19330  -.0035815474639113018 1  24
    635 19331    .004750373364158749 1  25
    635 19332    .006911320120927171 1  26
    635 19333    .004728822464825676 1  27
    635 19334   -.003735369801610383 1  28
    635 19337  .00003332777870359014 1  29
    635 19338  -.0023578617609811476 1  30
    635 19339 -.00009186494183275302 1  31
    635 19340  -.0005595924162700015 1  32
    635 19341    .002206195733016973 1  33
    635 19344   -.002860073211203511 1  34
    635 19345   -.005301712603692804 1  35
    635 19346  -.0037746952501050697 1  36
    635 19347    .007240445228310224 1  37
    635 19348    .003912566291607661 1  38
    635 19351  -.0013352694740707074 1  39
    635 19352   .0002256279979275736 1  40
    635 19353   .0006433124744136932 1  41
    635 19354    .015312682641730009 1  42
    635 19355   .0002960428933258887 1  43
    635 19358  -.0014222295297599624 1  44
    636 19359   .0006339170309631273 1  45
    636 19360   -.022008490752320125 1  46
    636 19361   -.008858491280316965 1  47
    636 19362   -.010779520608406295 1  48
    636 19365    .006229300018876658 1  49
    636 19366    .004996930213520691 1  50
    636 19367   .0019005922381169673 1  51
    636 19368   -.011271828051692797 1  52
    636 19369    .003588834356879144 1  53
    636 19372    .004847657250149337 1  54
    636 19373    .005894443595101009 1  55
    636 19374   .0021109159686571195 1  56
    636 19375  -.0052071923290811515 1  57
    636 19376    .012959073045128076 1  58
    636 19379   -.005108951954111327 1  59
    636 19380 -.00047906003378635735 1  60
    636 19381    .001706943813800203 1  61
    636 19382  -.0070343912901141455 1  62
    636 19383   -.006348750116238687 1  63
    636 19386   -.004262378764675895 1  64
    636 19387   .0009484018147797248 1  65
    636 19388    .000452411438326908 1  66
    636 19389   .0050766618601912825 1  67
    637 19390  -.0016129306100273156 1  68
    637 19393    .009081014896945768 1  69
    637 19394   -.006572516768344072 1  70
    637 19395   .0009330257175815925 1  71
    637 19396   .0012795959527481336 1  72
    637 19397  .00037238589334526883 1  73
    637 19400   -.001615891574521267 1  74
    637 19401   -.003330226252012547 1  75
    637 19402   -.006189581438057497 1  76
    637 19403   .0029258527320791413 1  77
    637 19404  -.0005459306838634231 1  78
    637 19407    .003371256411788268 1  79
    637 19408 -.00026369064833867834 1  80
    637 19409   -.005743165632896949 1  81
    637 19410    .010551447935921169 1  82
    637 19411  -.0006012414365436066 1  83
    637 19414   .0002711450795641044 1  84
    637 19415    .013570520965692535 1  85
    637 19416    .000927690301875474 1  86
    637 19417  -.0005260389268806091 1  87
    638 19418    .007376775271512076 1  88
    638 19421   -.001683487722149913 1  89
    638 19422   -.005682006977903286 1  90
    638 19423  -.0017043176045982615 1  91
    638 19424    -.00331402939108895 1  92
    638 19425  -.0024769933499026136 1  93
    638 19428  -.0032575483371352133 1  94
    638 19429   .0053034269596500426 1  95
    638 19430  .00008400396498719036 1  96
    638 19431   .0002183919631757236 1  97
    638 19432    .001284872100639846 1  98
    638 19435    .007162566782128802 1  99
    638 19436    .015372444518466118 1 100
    end
    format %tm monthly
    format %td date
    label values isin isin
    label def isin 1 "XS0295018070", modify

    Comment


    • #3
      The -rangestat- command has two problems. The first is that it refers to a variable -include- which is not in your data example. After dropping -if include- from the command (you can leave it in if your real data has a variable called include), the next problem is that -rangestat- does not parse time-series operators like lag. But you can easily get around that by just creating a new variable with the lag:

      Code:
      xtset isin date
      gen lag_px_change = L.px_change
      
      rangestat (corr) lag_px_change px_change, by(isin) interval(obs_count 0 20)
      Also, this -rangestat- command is calculating the correlation. If what you want is the covariance, change (corr) to (cov)

      Comment


      • #4
        Thank you, Clyde. Your solution is neat and has shown me a new perspective, thanks.

        I have adjusted the code, since the date variable has no observations on weekends, which leads to many missing values.

        Code:
        by isin: gen obs_count=_n
        xtset isin obs_count
        gen lag_px_change = L.px_change
        rangestat (cov) lag_px_change px_change, by(isin) interval(obs_count 0 20)

        The variable "cov_nobs" after running rangestat (cov) has the value of 20 / 21 right from the beginning. Since I have a rolling window of 21 observations and the covariance (cov_x) should be "backward looking", I am not sure which is the first relevant value.

        Should I ignore the first 20 covariance results (cov_x) and start with the 21st value or does the first covariance oberservation represent to the 21st time step and i should ignore the last 20 observations?

        I am a little off track here.



        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float monthly int date double(px_last px_change) long isin float ret_day byte has_px_change float(obs_count lag_px_change) double(cov_nobs cov_x)
        633 19297 118.441    .005740245403982524 1   1 1   1             . 20   2.387367031398217e-07
        634 19298 118.631    .001604174230207426 1   2 1   2    .005740245 21 -1.8490954786618315e-07
        634 19299 119.084   .0038185634446308548 1   3 1   3   .0016041743 21  -3.578909940745957e-07
        634 19302 119.384   .0025192301232743034 1   4 1   4   .0038185634 21 -4.1399595151762437e-08
        634 19303 119.031  -.0029568451383769553 1   5 1   5     .00251923 21 -1.4293868781387583e-06
        634 19304 120.307    .010719896497551026 1   6 1   6   -.002956845 21   6.521651021332537e-07
        634 19305  120.26  -.0003906672097217705 1   7 1   7    .010719896 21   3.801685469926813e-06
        634 19306 120.635    .003118243805088974 1   8 1   8  -.0003906672 21   3.161955835863245e-06
        634 19309 120.404   -.001914867161271677 1   9 1   9    .003118244 21  3.1889023996842828e-06
        634 19310 120.657   .0021012590943822475 1  10 1  10  -.0019148672 21   3.442391676596748e-06
        634 19311 120.453  -.0016907431810835138 1  11 1  11    .002101259 21  3.6208879572793385e-06
        634 19312 120.056  -.0032958913435116232 1  12 1  12   -.001690743 21  3.7820455327479186e-06
        634 19313 120.032 -.00019990671020191336 1  13 1  13  -.0032958915 21   3.542798791959473e-06
        634 19316 119.707   -.002707611303652383 1  14 1  14  -.0001999067 21  3.2004493760073762e-06
        634 19317 118.721    -.00823677813327533 1  15 1  15   -.002707611 21    3.88701496226094e-06
        634 19318 118.336   -.003242897212793062 1  16 1  16   -.008236778 21   3.799903234327974e-06
        634 19319 118.115   -.001867563547863741 1  17 1  17   -.003242897 21  1.1580350009734722e-06
        634 19320 117.727    -.00328493417432156 1  18 1  18  -.0018675636 21  1.9312299168810933e-06
        634 19323 118.413    .005827040525962547 1  19 1  19   -.003284934 21  1.0950938585852247e-06
        634 19324 118.339  -.0006249313842229998 1  20 1  20     .00582704 21  2.1807481755075244e-06
        634 19325 119.377    .008771410946518026 1  21 1  21  -.0006249314 21  2.4827407739872257e-06
        634 19326 119.304  -.0006115080794457333 1  22 1  22    .008771411 21  3.0227340865162946e-06
        634 19327 118.943  -.0030258834573862084 1  23 1  23  -.0006115081 21  3.1854145427404966e-06
        635 19330 118.517  -.0035815474639113018 1  24 1  24   -.003025883 21   2.954435013195103e-06
        635 19331  119.08    .004750373364158749 1  25 1  25  -.0035815474 21  2.0732009322922623e-06
        end
        format %tm monthly
        format %td date
        label values isin isin
        label def isin 1 "XS0295018070", modify

        Comment


        • #5
          If by "backward looking" you mean that the relevant statistic for a given date is the covariance calculated from that date and the 20 preceding dates, then you should code it differently:

          Code:
          rangestat (cov) lag_px_change px_change, by(isin) interval(obs_count -20 0)
          Is that what you want?

          Comment


          • #6
            Yes, that's the solution I needed. Thank you for your support.

            Comment


            • #7
              Felix Schrock How would this work if I had transaction prices every minute, for one whole day?

              Comment

              Working...
              X