  • Faster/Efficient way of doing HP filter

    Dear Statalisters,

    I have a panel dataset with 1500 ids and 400 time periods. I'm trying to run the Hodrick-Prescott filter on the time series for each id. The code that I'm currently running is:

    Code:
    bys id (t): hprescott oldvar, stub(newvar) smooth(6.25)
    This code works fine when I test it with a small number of ids, like 5. It just creates 2 new variables (trend and epsilon) for each id. However, when I run it with a much larger number of ids, it takes forever to compute.

    Is there a more efficient way of applying the HP filter to each time series when my dataset is a panel? Ideally, I would want just 2 variables at the end, one for the trend and one for the residuals, covering all ids and all time periods.

    I have seen another way of doing HP filter, through the command

    Code:
    tsfilter hp newvar = oldvar, smooth()
    but this did not let me run it by id.

    Regards,
    Hee Sung

  • #2
    I know nothing of hprescott (from SSC) or tsfilter but you can use runby (from SSC) to run any command that does not support the by prefix.

    Note that at first glance, hprescott seems to support by: but it creates new variables for each panel. Since you have 1500 panels, this will cause the data in memory to swell and that probably explains why it slows down.
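
    One way to see this swelling on a small scale is to count the variables in memory before and after the by-prefixed call. The sketch below is only illustrative: it assumes a small test dataset (a handful of panels) that is already tsset, with the variables id, t and oldvar from #1, and it assumes the by-prefixed call runs as described there. c(k) is Stata's count of variables in memory; how many variables get added depends on how hprescott names its output.

    Code:
    * run on a small test dataset only, not on the full 1500 panels
    display "variables before: " c(k)
    bysort id (t): hprescott oldvar, stub(newvar) smooth(6.25)
    display "variables after:  " c(k)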

    Here's a quick example of how to perform the same task using both methods. With 1500 panels, each with 400 time periods, it takes about 25 seconds for hprescott and 95 seconds for tsfilter to do their thing:

    Code:
    * create demonstration dataset
    clear all
    set seed 44324
    set obs 1500
    gen long id = _n
    expand 400
    bysort id: gen t = _n
    tsset id t
    gen oldvar = runiform()
    
    program hprescott_method
        tsset t
        hprescott oldvar, stub(newvar) smooth(6.25)
    end
    
    program tsfilter_method
        tsset t
        tsfilter hp newvar2 = oldvar, smooth(6.25)
    end
    
    runby hprescott_method, by(id) status
    runby tsfilter_method, by(id) status
    You can spot check the results for any panel. Here's code to check the second one:
    Code:
    . hprescott oldvar if id == 2, stub(check) smooth(6.25)
    
    . tsfilter hp check2 = oldvar if id == 2, smooth(6.25)
    
    . 
    . list if id == 2 & t < 5
    
           +---------------------------------------------------------------------------------------------+
           | id   t     oldvar   newvar~r_1   newva~m_1     newvar2   check_~r_1   check~m_1      check2 |
           |---------------------------------------------------------------------------------------------|
      401. |  2   1   .7739904   -.00500584   .77899623   -.0050058   -.00500584   .77899623   -.0050058 |
      402. |  2   2   .7997273    .18300588   .61672138    .1830059    .18300588   .61672138    .1830059 |
      403. |  2   3   .5745919    .12094628    .4536456    .1209463    .12094628    .4536456    .1209463 |
      404. |  2   4   .0282791   -.28996977   .31824889   -.2899698   -.28996977   .31824889   -.2899698 |
           +---------------------------------------------------------------------------------------------+
    
    .
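
    To extend the spot check to every panel at once, a minimal sketch follows. It assumes both runby calls above have completed, and that hprescott named its cyclical component newvar_oldvar_1; that name is inferred from the abbreviated header newvar~r_1 in the listing, so confirm it with describe first. The tolerance of 1e-6 is an arbitrary choice to allow for differences in storage precision between the two variables.

    Code:
    * confirm the actual variable names created by hprescott
    describe newvar*
    * assumed name: newvar_oldvar_1 (cyclical component from hprescott)
    * assert stops with an error if any observation differs beyond the tolerance
    assert reldif(newvar_oldvar_1, newvar2) < 1e-6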



    • #3
      I have not tested whether this is more efficient, but I'm currently computing the HP filter with the following loop, which lets me see how much longer the code will take:

      Code:
      gen newyvar = .
      foreach num of numlist 1/1500 {
          * show progress
          display `num'
          hprescott yvar if id == `num', stub(hpyvar`num') smooth(6.25)
          * keep only the residual, then drop the per-id variables
          replace newyvar = hpyvar`num'_yvar_1 if id == `num'
          drop hpyvar`num'_yvar_1 hpyvar`num'_yvar_s~1
      }
      I'm really interested only in the residual part, so that is all I save into newyvar. While this method lets me see how many more ids Stata still has to process, I'm not sure it is the best way to go about this. It's still taking a long time to compute.



      • #4
        Thank you Robert Picard!

        You posted your answer while I was typing my second attempt. Your method indeed works a lot faster than mine, and it is much more elegant too! I wonder why my second attempt still took so much longer than your approach with runby, but I'm happy to see the results nonetheless.
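
        A hedged way to compare the approaches is to time each one with Stata's timer commands. The sketch below assumes the demonstration dataset and the hprescott_method program from #2 are already in memory; the timer number 1 is arbitrary. The loop from #3 could be wrapped the same way with a second timer.

        Code:
        * time the runby approach from #2
        timer clear 1
        timer on 1
        runby hprescott_method, by(id)
        timer off 1
        * report elapsed seconds for timer 1
        timer list 1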
