
  • TSTRANSFORM: new Stata module for time-series transformations of variables

    Dear Statalisters,

    I have developed a new Stata command that generates new variables based on time-series transformations. Of particular interest might be the computation of backward and forward means, and deviations from those means. Backward and forward mean deviations are also known as backward orthogonal and forward orthogonal deviations.

    For completeness, it can also mimic the familiar time-series operators L#., F#., D#., and S#., but it has the advantage that multiple transformations can be performed by typing only a single command line in your do-files. (It also circumvents a Stata bug with zeroth seasonal differences, S0., that I have described here.)
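For readers less familiar with the operators mentioned above, here is a minimal sketch of what they compute. It uses only official Stata syntax on a made-up toy series, not tstransform itself; the variable names t and x are hypothetical.

```stata
* Toy series; t and x are hypothetical names used only for illustration.
clear
set obs 6
generate t = _n
generate double x = t^2
tsset t

* First lag: x[t-1]
generate double x_L1 = L1.x
* First lead: x[t+1]
generate double x_F1 = F1.x
* Second difference: x[t] - 2*x[t-1] + x[t-2]
generate double x_D2 = D2.x
* Seasonal (two-period) difference: x[t] - x[t-2]
generate double x_S2 = S2.x

list t x x_L1 x_F1 x_D2 x_S2
```

Each transformation here needs its own generate line, which is the repetition that a single multi-variable command is meant to avoid.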

    The program can be installed by typing
    net from ""
    in Stata’s command window.

    A help file that documents the command syntax and the available options can be accessed by typing
    help tstransform

    I plan to add other transformations in future updates. Comments and suggestions for additional transformations are welcome.

  • #2
    The new month starts with a major update of the tstransform package. In particular, the package now includes separate functions that can be used in the standard way with the egen command:
    • demean(): mean deviations
    • bmean(): backward mean
    • bdemean(): backward mean deviations
    • fmean(): forward mean
    • fdemean(): forward mean deviations
    These functions also work with the user-written command tsegen by Robert Picard and Nick Cox when the argument is a single varname with time-series operators. For details on tsegen see:
    For all of these functions, the rescale option multiplies the new variables by a scaling factor. If the original variables are independent and identically distributed, this scaling factor ensures that the theoretical variance of the transformed variables equals that of the original variables. This was proposed, for example, by Arellano and Bover (1995) in the context of forward orthogonal deviations. (See the help file of tstransform for more details.)
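To make the scaling idea concrete: if x is i.i.d. with variance s^2 and T future observations are available, then x minus the mean of those T observations has variance s^2*(1 + 1/T), so multiplying by sqrt(T/(T+1)) restores s^2. The sketch below computes forward mean deviations and this Arellano-Bover factor by hand on a made-up, gap-free single series. It is an illustration under those assumptions, with hypothetical variable names, and is not taken from the tstransform source; see the help file for what rescale does for the other functions.

```stata
* Hypothetical toy series; all variable names are for illustration only.
* Assumes a single series without gaps or missing values.
clear
set obs 5
generate t = _n
generate double x = t^2

* Number of future observations and their mean.
generate nf = _N - _n
generate double csum = sum(x)
generate double x_fmean = (csum[_N] - csum) / nf if nf > 0

* Forward mean deviation, then the scaling factor sqrt(T/(T+1)),
* where T (here nf) is the number of future observations.
generate double x_fdm = x - x_fmean
generate double x_fod = sqrt(nf / (nf + 1)) * x_fdm

list t x x_fmean x_fdm x_fod
```

The last observation has no future values, so its forward mean and both deviations are missing.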

    Here is an example with forward mean deviations (also known as forward orthogonal deviations) of a variable n and its first time lag L.n:
    webuse abdata
    by id: egen n_FDM = fdemean(n), rescale
    tsegen Ln_FDM = fdemean(L.n), rescale by(id)
    While egen does not support time-series operators, tsegen does not support the by prefix. However, the undocumented by() option can be used with tsegen to circumvent this problem here.

    Identical results can also be obtained with the tstransform syntax:
    tstransform n L.n, fdemean rescale generate(n_FDM Ln_FDM)
    tstransform automatically computes all transformations within panel if a panel identifier (in this example the variable id) is specified with xtset or tsset. It also has a replace option.

    To install the package type:
    net install tstransform, from(
    and to update an existing installation to the new version type:
    adoupdate tstransform, update
    As always, comments and suggestions are welcome.

    • Arellano, M. and O. Bover (1995). Another look at the instrumental variable estimation of error-components models. Journal of Econometrics 68: 29-51.
    Last edited by Sebastian Kripfganz; 01 Jun 2015, 07:38.


    • #3
      This looks very useful and complements the tsegen command of Robert Picard and myself.

      However, note that egen does not have an undocumented by() option. Rather, egen will accept a by() option under a wildcard and it's then a matter of whether the egen function being called does or does not allow that. Some do, some don't.

      Usually this detail will be irrelevant, as the most obvious group structure here is that of panel data, and careful users will declare that and use time series operators to ensure within-panel calculations.


      • #4
        Thanks for the comment, Nick. It is true that there is no general by() option for egen (and tsegen); whether it is accepted depends on the specific function involved. All of my new functions allow the by() option. In fact, it is needed in a panel data context because otherwise those functions would compute means (or mean deviations) over the whole data set, so I am happy that your tsegen command passes it on. (In the example in my previous post, the tsegen command without the by() option would lead to a different and meaningless outcome.)
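The difference between pooled and within-panel computations can be seen in a small sketch. It uses the official egen function mean() rather than the new functions, and a made-up two-panel data set; id, t, and x are hypothetical names.

```stata
* Hypothetical two-panel toy data set; id, t, and x are illustrative names.
clear
input id t x
1 1 1
1 2 3
2 1 10
2 2 14
end

* Mean over the whole data set (no grouping).
egen double m_all = mean(x)
* Mean within each panel; sorting on t within id keeps the row order fixed.
bysort id (t): egen double m_id = mean(x)

* Mean deviations relative to the pooled and the within-panel mean.
generate double dm_all = x - m_all
generate double dm_id = x - m_id

list, sepby(id)
```

The pooled deviations mix level differences between panels into the result, whereas the within-panel deviations sum to zero for each id.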