  • ASROL Update : Multiple Statistics and Multiple Variables in one Go

    The discussion that followed the asrol version 4.0 update [See this thread ] convinced me that computing multiple statistics for multiple variables in a single call would be a useful feature. River Huang, daniel klein, and William Lisowski raised important issues related to the calculation of multiple statistics for multiple variables. In this post, I would like to announce the release of version 4.2.0. The details and examples related to this update follow.

    New users can install it by
     ssc install asrol
    and existing users can update the version by
     adoupdate asrol, update

    asrol generates rolling window / group descriptive statistics

    Version Updates

    The key features added to this version of asrol are as follows:

    1. Support for multiple variables

    asrol now supports calculation of required statistics for multiple variables in one go. See examples below.

    2. Support for multiple statistics

    This version also supports calculation of more than one statistic in one go. See examples below.

    By-Group Statistics
    The first two versions of asrol supported only rolling-window calculations. From version 3 onward, asrol also supports statistics without a rolling window. Therefore, it can be used just like the egen command for mean, geometric mean, median, minimum, maximum, count, standard deviation, percentiles, count of missing values, product, first, last and sum.

    With this version update, the group-statistics option becomes more attractive. With egen, computing several statistics or handling several variables means either typing the command several times or writing a loop. This is not the case with asrol: it can compute a number of statistics for several variables in one go (see examples below).

    To see the version updates in action, along with the speed of asrol, let us create a data set of one million observations, with 10000 firm IDs and 100 periods for each firm. We shall also create three random variables named X1, X2, and X3.

    set obs 1000000
    gen id=mod(_n,10000)+1
    bys id: gen year=_n+1917
    gen X1 = uniform()
    gen X2 = uniform()
    gen X3 = uniform()
    Example 1: Find mean of X1 for each firm in a rolling window of 20 years
    bys id: asrol X1, stat(mean) w(year 20)
    Example 2: Find mean of X1, X2, and X3 for each firm in a rolling window of 20 years
    bys id: asrol X1 X2 X3, stat(mean) w(year 20)
    Example 3: Find mean, standard deviation and count of X1 for each firm in a rolling window of 20 years
    bys id: asrol X1, stat(mean sd count) w(year 20)
    Example 4: Find mean, standard deviation and count of X1, X2, and X3 for each firm in a rolling window of 20 years
    bys id: asrol X1 X2 X3, stat(mean sd count) w(year 20)
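
    For readers who want to see what a 20-year window covers, the rolling mean in Example 1 can be reproduced by hand with time-series lag operators. This is only a sketch. It assumes that w(year 20) covers the current year plus the 19 preceding years, and that asrol by default uses whatever observations are available at the start of each panel. The variable names sumX1, nX1, and check_mean20_X1 are made up for this illustration.

    ```stata
    * Declare the panel structure so that lag operators (L#.) respect id and year
    xtset id year

    gen double sumX1 = 0
    gen long   nX1   = 0

    * Accumulate X1 over the current year (lag 0) and the 19 previous years
    forvalues k = 0/19 {
        replace sumX1 = sumX1 + L`k'.X1 if !missing(L`k'.X1)
        replace nX1   = nX1 + 1         if !missing(L`k'.X1)
    }

    * Mean over the observations actually available in the window
    gen double check_mean20_X1 = sumX1 / nX1
    ```

    The result can then be compared with the variable that asrol creates in Example 1, for instance with summarize or compare.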

    Comparison with EGEN for group statistics
    timer clear
    timer on 1
    bys id : asrol X1 X2 X3, stat(sd mean median count max)
    timer off 1
    local stat "sd mean median count max"
    timer on 2
    foreach v of varlist X1 X2 X3{
    foreach s of local stat{
    bys id: egen double E`s'_`v' = `s'(`v')
    }
    }
    timer off 2
    timer list

    . timer list
    1: 18.80 / 1 = 18.7990
    2: 52.12 / 1 = 52.1240
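
    The two timings can also be turned into a single ratio. A small sketch, assuming timers 1 and 2 still hold the runs above; timer list leaves the elapsed seconds in r(t1) and r(t2).

    ```stata
    * Retrieve the stored timings without printing the list again
    quietly timer list

    * Ratio of the egen loop time (timer 2) to the asrol time (timer 1)
    display "egen loop / asrol time ratio: " %5.2f r(t2)/r(t1)
    ```

    With the figures reported here (52.12 versus 18.80 seconds), the ratio is roughly 2.8.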
    Instead of typing X1, X2, X3, we can also use wildcards. For example,
    bys id :  asrol X*, stat(sd mean median count max)
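
    Before passing a wildcard to asrol, it can be worth checking which variables it expands to, since X* matches every variable whose name starts with X, not only X1, X2, and X3. One way to do this is Stata's unab command (the local macro name xvars is arbitrary).

    ```stata
    * Expand the wildcard into a full variable list and show it
    unab xvars : X*
    display "`xvars'"
    ```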
    For the above tests, I used Stata 13 MP, Windows 10, 64 bit, 5 GB RAM, Intel(R) Core (TM)2 CPU 6400 @ 2.13 GHz.

    Attaullah Shah, PhD.
    Associate Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    If you use MS Word, do check my asdoc program that easily sends Stata output to MS Word

  • #2
    Many thanks, Attaullah.
    Ho-Chuan (River) Huang
    Stata 15.1, MP(4)