  • How to compute rolling median absolute deviation

    I am trying to compute the rolling median absolute deviation (MAD) of a variable x within each group id, using a window of 20 observations. I managed to compute the rolling median with rangestat. I then tried to use tsegen to call egen's mad() function.

    Create data to play with:
    Code:
    clear
    cls
    set obs 100
    gen id = _n
    expand 100
    gen x = rnormal()
    bys id: gen time = _n
    Compute rolling median:
    Code:
    rangestat (median)  x, by(id) interval(time -20 0)  // this works
    Compute rolling median absolute deviation:
    Code:
    xtset id time
    tsegen mad_x = mad(L(0/20).x) // this does not work
    The last command results in an error. I fail to see how to pass the "use 20 observations" information to mad(). Any suggestions?

    (Disclaimer: Cross posted on SO: https://stackoverflow.com/questions/...lute-deviation)
    Last edited by Jannic Cutura; 22 Oct 2019, 13:54.

  • #2
    This works but is not very elegant:
    Code:
     
     /* Add lagged values */
     generate L1x = abs(L.x - x_median)
     generate L2x = abs(L2.x - x_median)
     . . .
     generate L20x = abs(L20.x - x_median)
     /* Compute median of those */
     egen mad = rowmedian(L1x L2x ... L20x)



    • #3
      The Syntax section of the tsegen (from SSC) help file indicates that you can use any egen function that expects a varlist. The help file for egen indicates that the mad() function expects an expression (typically a single variable) rather than a varlist, so you can't use it with tsegen.
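
      To illustrate the distinction, here is a minimal sketch of a call that tsegen should accept, because rowmedian() takes a varlist. It computes the rolling median over the previous 20 observations, not the MAD. The variable name med20 is made up for this example, and tsegen must be installed from SSC:
      Code:
      xtset id time
      * rowmedian() expects a varlist, so it can take a time-series varlist
      tsegen double med20 = rowmedian(L(1/20).x)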

      Your solution in #2 can be restated as:
      Code:
      forvalues i=1/20 {
          generate double L`i'x = abs(L`i'.x-x_median)
      }
      egen mad = rowmedian(L1x-L20x)
      but this will not yield the correct results if you use x_median as generated in #1, since that median was computed with interval(time -20 0), a window that spans 21 periods (the current observation plus 20 lags).

      I'll assume that you do not want to include the current observation, so the complete and correct solution would be:
      Code:
      xtset id time
      rangestat (median) x (count) x, by(id) interval(time -20 -1)
      
      forvalues i=1/20 {
          generate double L`i'x = abs(L`i'.x-x_median)
      }
      egen mad = rowmedian(L1x-L20x)
      Note that I added a (count) to the rangestat call to show the number of observations used in each window.
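      If only full 20-observation windows are wanted (an assumption on my part, the question does not say), the count can be used to blank out the early observations. A minimal sketch, relying on x_count being the name rangestat gives to the (count) x result:
      Code:
      * keep the MAD only where the window held all 20 prior observations
      replace mad = . if x_count < 20
      * the helper variables are no longer needed
      drop L1x-L20x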

      While rangestat does not support the egen mad() function, you could use it via rangerun (from SSC):
      Code:
      program do1obs
          egen rrmad = mad(x)
      end
      rangerun do1obs, by(id) interval(time -20 -1)
      This is a good way to check the previous solution, but since rangerun is slower than rangestat, I would stick with the previous solution. Finally, since rangestat is extensible, you could also write a Mata function to calculate the measure and do it all in rangestat. There's no median() function in Mata, but you can copy the code from the rangestat ado:

      Code:
      xtset id time
      
      mata:
      
      /* ---- include a copy of the rs_median() function from rangestat.ado here --------- */
      
      real rowvector rsmad(real matrix X) {
      
          return(rs_median(abs(X :- rs_median(X))))
      
      }
      
      end
      
      rangestat (rsmad) x (count) x,  by(id) interval(time -20 -1)
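
      As a rough way to carry out the check mentioned above, the loop-based result (mad) can be compared with the rangerun result (rrmad) once both exist in the same dataset. This is only a sketch: the 1e-5 tolerance is an arbitrary choice of mine, and exact equality is not expected because egen stores results as float by default. The same comparison could be applied to the variable created by the rangestat (rsmad) call; use describe to find its name.
      Code:
      * both measures should be missing for the same observations
      assert missing(mad) == missing(rrmad)
      * and agree up to a small tolerance wherever both are defined
      assert reldif(mad, rrmad) < 1e-5 if !missing(mad, rrmad)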
