  • Can we use only the filtering part of -sspace- ? - a trial toward panel Kalman Filtering

    Dear all,

    I'm wondering whether we can use only the filtering part of the calculation done by -sspace-, with a fixed set of parameters.

    What I'm thinking of is estimating potential output for developing countries using the Kalman filter with panel data. Potential output is assumed to be nonstationary, and initial values need to be estimated by country (serving as a country fixed effect), but structural parameters (e.g., Phillips curve coefficients) and the variances of shocks are assumed to be common across countries, so that I can use more observations than a single-country estimation would allow. Two issues in research on developing countries are short sample periods and high data volatility; I would like to mitigate them through panel estimation, which has limitations of its own, such as the strong assumption of common parameters.

    Toward the panel Kalman filter, my idea is to use the filtering part of -sspace- to calculate the likelihood for each country given a fixed set of parameters, and then to aggregate these up to form the likelihood in the panel setting, which could be maximized with -ml- or the like.
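    Concretely, a rough (untested) sketch of the aggregation step I have in mind, where the model specification, the panel variable country, the observed variable y, and the parameter vector b0 are all placeholders:

    ```stata
    * Untested sketch: evaluate the panel log likelihood at a fixed
    * parameter vector b0 by summing each country's -sspace- filtering
    * log likelihood (iterate(0) skips the maximization step).
    * Model, variable names, and options are placeholders.
    capture program drop panel_ll
    program define panel_ll, rclass
        tempname ll
        scalar `ll' = 0
        levelsof country, local(ctries)
        foreach c of local ctries {
            quietly sspace (z L.z e.z, state) (y z, noconstant) ///
                if country == `c', covstate(identity) from(b0) iterate(0)
            scalar `ll' = `ll' + e(ll)   // add this country's log likelihood
        }
        return scalar ll = `ll'          // panel log likelihood at b0
    end
    ```

    Country-specific initial states would still need to be handled on top of this, for example through the parameter vector itself.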

    My version of Stata is 14.0, and to my understanding -sspace- does not allow multiple panels. Of course, if there are other ways (commands/ado-files) to estimate a panel state-space model, then there would be no need for any of this.

    I would very much appreciate any advice or comments on this.

    Best Regards,
    Futoshi

    --
    Futoshi Narita (Mr.)
    Economist
    Developing Markets Strategy Unit
    Strategy, Policy, and Review Department
    International Monetary Fund
    TEL: 202-623-7143
    Email: [email protected]

  • #2
    Dear all,

    I found a solution myself and am sharing it here for the record. The following blog entry gives exactly the answer to my question.

    http://blog.stata.com/2015/05/26/bay...ilt-in-models/

    The key is using iterate(0) and specifying, in initial(), the parameters at which the likelihood is to be evaluated. For -sspace-, you need to understand the order of the parameters to set up the vector of "initial" parameters. In the error-form syntax, the coefficients of the variables in each state equation come first, then the "_cons" term, and then the coefficients of the error terms. The coefficients in the observation equations come next, in the same way (i.e., the coefficients on the variables, the constant term, and then the coefficients of the error terms). This is the same order as in the estimation results table.
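    For instance, for a simple two-equation model, the mapping might look like this (a hedged illustration; the model and the numbers are arbitrary, and which error coefficients are free depends on your covariance options such as covstate()):

    ```stata
    * Hedged illustration of the parameter order for the fixed vector:
    * state equation u (noconstant): coefficient on L.u, then on e.u;
    * observation equation y (noconstant): coefficient on u.
    mat b0 = (0.9, 1, 1)
    sspace (u L.u e.u, state noconstant) ///
           (y u, noconstant) ///
           , covstate(identity) from(b0) iterate(0)
    ```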

    A shortcoming of this approach is that the required computation time is very long, as noted in the blog entry above, and this is what I am now struggling with.

    Curiously, I found that the burn-in MCMC sampling is faster than the actual MCMC sampling that follows it, but I cannot figure out why.

    Any comments on computation time (on -sspace-, on the -bayesmh- evaluator, or both) would be highly appreciated.

    Best Regards,
    Futoshi



    • #3
      Dear all,

      What I found is that the time needed to evaluate the likelihood using -sspace- grows as the loop proceeds. For example, the first run takes 0.5 seconds; the time gradually rises to 5 seconds by the 84th run and eventually reaches 20 seconds by the 208th run. The increase seems to be linear (not exponential) in the number of -sspace- runs and never stops. This roughly implies that the total computation time is proportional to the square of the number of runs; if so, 15,000 MCMC draws could take on the order of 15000^2 seconds....

      This increasing running time of -sspace- has nothing to do with MCMC (or -bayesmh-), because I observe the same problem in a simple loop. Here is sample code that demonstrates the incremental running time of -sspace-.

      Code:
      sysuse sp500, clear
      
      * build a weekly series of (log) closing prices
      gen datew = wofd(date)
      format datew %tw
      collapse close, by(datew)
      sort datew
      tsset datew
      gen double lnc = log(close)
      
      * fixed parameter vector; element 2 (the _cons of the zp equation)
      * is varied across iterations below
      mat b0 = (1, 0, 0.5, 0.9, 0.5, 1, 1)
      
      forvalues i=1/100 {
          timer clear
          timer on 1
          mat b0[1,2] = -0.0001*`i'
      
          * evaluate the likelihood at b0 only: iterate(0) skips maximization
          sspace ///
          (zp L.zp e.zp, state) ///
          (zt L.zt e.zt, state noconstant) ///
          (lnc zp zt, noconstant) ///
          , covstate(identity) from(b0) ///
          iterate(0)
      
          timer off 1
          timer list 1
          di as err "This is `i'-th run: log likelihood is " e(ll)
          di as err string((r(t1) - mod(r(t1), 60))/60,"%15.0f") " min " string(mod(r(t1), 60),"%15.1f") " sec "
      }
      When I replaced -sspace- with -regress- or other similar estimation commands, I did not observe the same problem.

      I hope someone can share thoughts on, and a possible solution to, this computation-time problem.

      Best Regards,
      Futoshi



      • #4
        Futoshi,

        StataCorp is looking into why the execution time of -sspace- increases with repeated calls to the estimator. In the meantime, we have found that if you call mata: mata clear at the top of your forvalues loop, the execution time of -sspace- remains fairly constant.
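        Applied to the sample code in #3, the workaround looks like this (a sketch; the data setup and b0 are as in that code, and Stata matrices such as b0 are unaffected by mata clear):

        ```stata
        * Workaround sketch: clear Mata at the top of each iteration so that
        * objects left behind by -sspace- do not accumulate across runs.
        forvalues i=1/100 {
            mata: mata clear
            mat b0[1,2] = -0.0001*`i'
            sspace (zp L.zp e.zp, state)            ///
                   (zt L.zt e.zt, state noconstant) ///
                   (lnc zp zt, noconstant)          ///
                   , covstate(identity) from(b0) iterate(0)
        }
        ```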



        • #5
          Dear Richard,

          Thank you so much for your reply! I really appreciate the fact that you and your colleagues are looking into this issue!

          I also investigated this issue further and reached the same conclusion: mata clear makes the execution time of -sspace- remain fairly constant.

          Using mata clear only partially solves my problem, because doing so loses (1) all information in Mata memory, which prevents me from using -bayesmh-, and (2) the efficiency gain from storing in Mata memory the functions that are called repeatedly.

          Just for your information, I inserted mata memory in the loop and found a steady increase of about 380 matrices (or scalars) per iteration in my sample code above, which might have something to do with the slowdown.

          In my view, the best solution would be to make sure that all the matrices (or scalars) that -sspace- has finished using are dropped from Mata memory. I guess this issue might also slow down the calculation even when -sspace- is used normally to estimate a state-space model by maximum likelihood.

          I would very much appreciate it if you could possibly fix this problem soon.

          Again, thank you very much for taking the time on this issue!

          Best Regards,
          Futoshi
          Last edited by Futoshi Narita; 20 May 2016, 15:34.
