  • ASROL : New Version - Speed Advantage

    Thanks to Kit Baum, version 3 of ASROL is now available on SSC. New users can install it with
    Code:
    ssc install asrol
    and existing users can update it with
    Code:
    adoupdate asrol

    Description

    asrol calculates descriptive statistics in a user-defined rolling window. It efficiently
    handles all types of data structures, such as data declared as time series or panel data,
    undeclared data, and data with duplicate values, missing values or time series
    gaps.

    asrol uses efficient codings in the Mata language which makes this version extremely fast as
    compared to other available programs. The speed efficiency matters more in large data sets.
    This version also overcomes a limitation of the previous version of asrol, which could calculate
    statistics only in rolling windows of length up to 104. The new version can accommodate a rolling
    window of any length.

    Speed Comparison
    On panel and time series data, ASROL and RANGESTAT (from SSC) both perform well, with a marginal speed advantage for ASROL. However, ASROL outperforms RANGESTAT significantly in panels with duplicate observations. See the results of the following tests, run with Stata 14.2 on about 5 million observations. I have not yet tested different window lengths.

    Code:
    clear
    set obs 1000
    gen industry=_n
    gen year=_n+1917
    expand 5
    bys industry: gen country=_n
    expand 1000
    bys ind: gen company=_n
    gen profit=uniform()
    tsset company year
    
    timer clear 
    timer on 1
    asrol profit, s(mean) w(100)
    timer off 1
    
    timer on 2
    rangestat (mean) profit, interval(year -99 0) by(company) 
    
    timer off 2
    
    timer list
    /*
      1:    119.22 /        1 =     119.2250
      2:    130.56 /        1 =     130.5600
    */
    
    cap drop mean100_profit profit_mean
    
    timer clear 
    timer on 1
    asrol profit, s(mean) w(year 100) by(country industry)
    timer off 1
    
    timer on 2
    rangestat (mean) profit, interval(year -99 0) by(country industry) 
    
    timer off 2
    assert mean100_profit== profit_mean
    
    timer list
    /*
       1:     40.98 /        1 =      40.9840
       2:    681.49 /        1 =     681.4930
    */
    Regards
    --------------------------------------------------
    Attaullah Shah, PhD.
    Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    FinTechProfessor.com
    https://asdocx.com
    Check out my asdoc program, which sends outputs to MS Word.
    For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

  • #2
    People interested in the choice between asrol and rangestat (both SSC) for problems in which they both offer solutions may be interested in some comments.

    For the record, Attaullah is sole author of asrol and Robert Picard, Roberto Ferrer and I are joint authors of rangestat. My co-authors are more than welcome to add or subtract comments or emphases: what is said here is on my own responsibility.

    It's a little difficult to discuss some details of asrol's new version. Its Mata code is presented as compiled, so the source code is not accessible. Some choices in asrol are hidden in the code and not documented explicitly in the help. These are just details and fixable.

    rangestat by the way was not written with asrol in mind but from the outset was framed much more generally, in supporting (this is not a complete list)
    • more statistics (in principle, many, many more dependent on the user's willingness to write extra Mata code, or ability to find same); asrol in this version has added further statistics, which is excellent, but as yet does not support a user's own statistics
    • the production of multiple statistics in a single call
    • a wider range of windows (asrol appears limited to windows defined by integer-valued displacements backwards)
    • exclusion of the present observation from calculations (for those many problems in which summaries of the others are needed)
    Ideally of course the extra generality of rangestat should be cost-free, but Attaullah's examples flag an occasion in which it will bite.

    The post above in essence gives two examples, on one of which the two programs give the same results about as quickly. Here is my confirmation of the general pattern of results, split into two. Both examples depend on a dataset created by this code with 5000000 observations. Naturally your random numbers and machine timings may differ.

    Code:
    . clear
    
    . set obs 1000
    number of observations (_N) was 0, now 1,000
    
    . gen industry=_n
    
    . gen year=_n+1917
    
    . expand 5
    (4,000 observations created)
    
    . bys industry: gen country=_n
    
    . expand 1000
    (4,995,000 observations created)
    
    . bys ind: gen company=_n
    
    . gen profit=uniform()
    
    . tsset company year
           panel variable:  company (strongly balanced)
            time variable:  year, 1918 to 2917
                    delta:  1 unit
    
    .
    . timer clear
    
    . timer on 1
    
    . asrol profit, s(mean) w(100)
    
    . timer off 1
    
    .
    . timer on 2
    
    . rangestat (mean) profit, interval(year -99 0) by(company)
    
    .
    . timer off 2
    
    .
    . timer list
       1:     35.73 /        1 =      35.7280
       2:     39.46 /        1 =      39.4620
    
    .
    . cap drop mean100_profit profit_mean
    At first sight, it's a little surprising that Attaullah's syntax for asrol is equivalent to the rangestat syntax as company isn't specified at all. The explanation is that in the circumstances here results are automatically done separately by the panel identifier declared by the earlier tsset. That's an unsurprising default; my suggestion is merely that it is made more explicit in the help.
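
    The default can be checked with built-in tools. As a minimal sketch on a toy panel (the data and the variable names company, year and profit below are made up for illustration and are not taken from asrol's code), tsset stores the declared identifiers in r(panelvar) and r(timevar), which any program can consult when no by() is given:

    ```stata
    clear
    set obs 6
    gen company = ceil(_n/3)                 // toy data: two companies, three years each
    bysort company: gen year = 2000 + _n
    gen profit = _n

    tsset company year
    quietly tsset                            // with no arguments, tsset refreshes r()
    display "panel variable: `r(panelvar)'"
    display "time variable:  `r(timevar)'"

    * time-series operators respect the declared panels
    gen double prev = L.profit
    list company year profit prev, sepby(company)
    ```

    The lag operator L. shows the same convention at work: it never reaches across panels, so the first year of each company has no previous value.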

    Let's turn to the second example, in which I have added some extra statements.
    Code:
    . * NJC extra
    . bysort country industry : gen count = _N
    
    . bysort country industry (year) : gen check = year[1] == year[_N]
    
    . summarize
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
        industry |  5,000,000       500.5     288.675          1       1000
            year |  5,000,000      2417.5     288.675       1918       2917
         country |  5,000,000           3    1.414214          1          5
         company |  5,000,000      2500.5    1443.376          1       5000
          profit |  5,000,000    .4999532    .2886541   1.84e-07   .9999996
    -------------+---------------------------------------------------------
           count |  5,000,000        1000           0       1000       1000
           check |  5,000,000           1           0          1          1
    
    . * end NJC extra
    .
    . timer clear
    
    . timer on 1
    
    . asrol profit, s(mean) w(year 100) by(country industry)
    
    . timer off 1
    
    .
    . timer on 2
    
    . rangestat (mean) profit, interval(year -99 0) by(country industry)
    
    .
    . timer off 2
    
    . * also extra
    . timer on 3
    
    . bysort country industry: egen double another_mean = mean(profit)
    
    . timer off 3
    . * end other extra
    
    .
    . timer list
       1:      7.27 /        1 =       7.2730
       2:    156.72 /        1 =     156.7240
       3:      1.98 /        1 =       1.9840
    
    .
    . assert mean100_profit== profit_mean
    
    . assert another_mean == mean100_profit
    Here rangestat is appallingly slow. How embarrassed should we (its authors) be? Not much, is my answer.

    By accident or design, the data here are groupings of country and industry in which year is constant, so the window() or interval() specification is redundant. This bites less with asrol than with rangestat: the latter, with its greater generality, loops robotically over the possibilities, which does slow it down a lot. But this is an extreme problem for which window or interval options are unnecessary. As my extra code shows, even without using Mata, a direct egen solution outperforms either program.

    I am thus grateful for this example, which I find illuminating but not especially troubling. If we find that similar problems arise repeatedly in practice, then we can advise people just to use egen or we can consider revisiting the code to detect this. But neither program is a good or natural choice for problems with windows when those windows are just points.

    Although it's not explicit so far, what asrol does often is rectangularise the data by using its own variant of tsfill internally. With many datasets that changing of the dataset will slow asrol down considerably, so there could be another round of comparisons with such examples. I haven't studied the code or experimented enough to determine whether the dataset change is permanent. If it is, that should be documented.
    Last edited by Nick Cox; 14 Mar 2017, 09:07.



    • #3
      asrol gives me the following error

      this is version 12.1 of Stata; it cannot run version 13.0 programs
      You can purchase the latest version of Stata by visiting http://www.stata.com.


      Can you help?
      Last edited by Youshida Koki; 15 Mar 2017, 00:02.



      • #4
        Dear Nick Cox
        Let me thank you for your detailed comments and comparison of the two programs. I respect you as my guru in Stata and hence I always receive your comments with deep regards and appreciation.
        You have pointed out that
        Some choices in asrol are hidden in the code and not documented explicitly in the help. These are just details and fixable.
        I shall add these details to the help file in its next version.

        Also, you have commented
        At first sight, it's a little surprising that Attaullah's syntax for asrol is equivalent to the rangestat syntax as company isn't specified at all.
        While designing asrol, I wanted to spare users from entering too many details in the syntax. If a user has already declared the data as panel or time series data, asrol capitalizes on that and does not require the panel identifier or the time identifier to be specified. However, if the data are not so declared, or the user wants statistics on variables other than the panel or time identifiers, he or she can specify the desired variables in the by() and window() options, as in the second example, i.e.
        Code:
        asrol profit, s(mean) w(year 100) by(country industry)
        Of course, I shall add further explanation to the help file in this regard.

        And finally, you have commented
        By accident or design, the data here are groupings of country and industry in which year is constant, so the window() or interval() specification is redundant. This bites less with asrol than with rangestat: the latter, with its greater generality, loops robotically over the possibilities, which does slow it down a lot. But this is an extreme problem for which window or interval options are unnecessary. As my extra code shows, even without using Mata, a direct egen solution outperforms either program.
        Actually, the second example does not capture what I intended. In practice, we often face situations in corporate finance where data are available on different dimensions, such as yearly data for firms, industries, and countries. So the bysort variables can be company, company and industry, company and country, or industry and country. The data set we generated earlier happens to have year constant within country and industry. Let me share a more realistic data set where year is not constant:
        https://www.dropbox.com/s/qjhn9fwgx9i5e19/data.dta?dl=0

        Using the above data set, again asrol is at least 14 times faster than rangestat. Unsurprisingly egen cannot produce identical results as produced by asrol or rangestat (Because the year is not constant in the grouping of country and industry)

        Code:
        timer clear
        timer on 1
        asrol profit, s(mean) w(year 5) by(country industry)
        timer off 1
        
        timer on 2
        rangestat (mean) profit, interval(year -4 0) by(country industry)
        
        timer off 2
        assert mean5_profit== profit_mean
        
        timer list
        /*
          1:      1.19 /        1 =       1.1850
          2:     17.06 /        1 =      17.0640
        
        . dis 17.06/1.19
        14.336134
        */
        
        assert another_mean== mean5_profit
        /*
        306596 contradictions in 306838 observations
        assertion is false
        */
        Last edited by Attaullah Shah; 15 Mar 2017, 00:09.


        • #5
          Attaullah: Thanks for the comments and extra example.

          Ultimately the only complete documentation of a program's code is the code itself, but users can (and often do) skip documentation that explains more than they want. Personally I lean to documenting important defaults. It would be false modesty to call myself a beginning user and you have it on record that much of the syntax of asrol was a puzzle to me until I looked at the code.

          It's already established that rangestat will sometimes be much slower than asrol and it's easy to concoct examples where rangestat is faster too. The former slowness on occasion is a side-effect of the generality of rangestat.

          In particular, it's my guess that asrol assumes that the window variable is allowed only to have integer values within the observed range, e.g. that with a window from -4 to 0 only displacements of -4 -3 -2 -1 0 are allowed. If so, then asrol can exploit that and fine; it is often a fair assumption for much panel or panel-like data, observed or observable each day, month or year, but the way it is exploited is crucial.

          rangestat makes no such assumption and instead was inspired partly by examples where data are quite irregular in time and/or the window variable takes on fractional values (e.g. for scatter plot smoothing or summary).

          Concretely, what would happen if the window were the previous 100 days and the time values were irregular? For example, clinicians and biostatisticians often want to know something about such windows. Clinic or hospital visits or other times of measurements for one patient might be say 77, 33 and 8 days before the present, with other quite different times for different patients. It's my wild guess that in this circumstance asrol would temporarily construct data for 100, 99, 98, ..., 3, 2, 1 days before the present, most of the extra variables being missing values, typically implying a massive temporary increase in dataset size. Can you confirm or rebut that?
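
          To make the concern concrete, here is a minimal sketch using official Stata's tsfill on made-up visit days for two hypothetical patients. It shows only what rectangularising does to dataset size; whether asrol does anything like this internally is exactly the open question, since its Mata source is not visible.

          ```stata
          clear
          input id date
          1  8
          1 33
          1 77
          2  5
          2 90
          end

          tsset id date
          quietly count
          local before = r(N)

          * tsfill adds one observation for every missing date
          * between each panel's first and last observed date
          tsfill
          quietly count
          local after = r(N)

          display "observations before: `before', after: `after'"
          ```

          Here 5 observations become 156 (70 days spanned by patient 1 plus 86 by patient 2). The inflation grows with the length of the gaps, not with the number of visits.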

          As before, I can't inspect your Mata code because it is compiled. Detailed comparison of the two programs is inhibited unless the code of asrol is fully visible to check on implicit as well as explicit assumptions.

          Naturally, it's fine to write programs that work well for the problems you care about. I do that too, but users might want to beware of exaggerated claims based on a narrow range of examples.

          Incidentally, you don't give the egen code you used, but it is not a valid criticism that it produces wrong results if the code doesn't match the problem.
          Last edited by Nick Cox; 15 Mar 2017, 02:35.



          • #6
            Youshida #3: Your problem is discussed generically at http://www.stata.com/support/faqs/pr...stata-version/



            • #7
              Dear Nick, thanks again for your reply and time.

              You commented that
              Concretely, what would happen if the window were the previous 100 days and the time values were irregular? For example, clinicians and biostatisticians often want to know something about such windows. Clinic or hospital visits or other times of measurements for one patient might be say 77, 33 and 8 days before the present, with other quite different times for different patients. It's my wild guess that in this circumstance asrol would temporarily construct data for 100, 99, 98, ..., 3, 2, 1 days before the present, most of the extra variables being missing values, typically implying a massive temporary increase in dataset size. Can you confirm or rebut that? ... I do that too, but users might want to beware of exaggerated claims based on a narrow range of examples.
              asrol uses two different approaches to handle regular and irregular data sets. I have tried to subject the program to different tests under different data structures and so far it has done fairly well. I shall be more interested in examples of data sets where it unnecessarily increases the data set and performs poorly. So far, I have not encountered such a situation. As usual, I would really appreciate your input here to make it useful to a wide range of users.

              Incidentally, you don't give the egen code you used, but it is not a valid criticism that it produces wrong results if the code doesn't match the problem.
              I forgot to add the egen code
              Code:
               bysort country industry : egen double another_mean = mean(profit)
              I did not blame egen. I simply tried to reply to your earlier example that egen cannot be used where year is not constant in groupings of industry and country. So my text reads as follows, with emphasis on Unsurprisingly ... and (Because the year is not constant in the grouping of country and industry)

              Using the above data set, again asrol is at least 14 times faster than rangestat. Unsurprisingly egen cannot produce identical results as produced by asrol or rangestat (Because the year is not constant in the grouping of country and industry)


              • #8
                We agree on egen: it's not relevant for your new example because it can't handle non-point windows, at least not directly, but that's the case to make. There was no value to running the comparison.

                Moving forward: I don't think you've confirmed or rebutted my guess or explained precisely what asrol does with irregular data. It's futile for me to try to guess further while your Mata code is invisible. That's a programmer's choice, but the rest of us are largely limited to comparing performance while that is so. But I'll concoct plausible examples so that we can see how it works in other quite different problems.



                • #9
                  Thanks Nick for your time. I shall really appreciate further example data sets to test both the programs.


                  • #10
                    Here we go. We imagine (simulate) 100000 patients visiting a clinic at irregular times within a two-year period. I count the number of visits in a window of 100 days up to and including the present.

                    Incidentally, I was going to make the comparison for the previous 100 days, strict sense, i.e. interval(date -100 -1) in rangestat terms, but I can't see that asrol has a way to specify windows that don't end with the present observation. That could be done indirectly by shifting the time variable, but I didn't bother with that.

                    Code:
                    . clear
                    
                    . timer clear
                    
                    . set seed 2803
                    
                    . set obs 100000
                    number of observations (_N) was 0, now 100,000
                    
                    . gen id = _n
                    
                    . gen nvisits = ceil(20 * sqrt(runiform()))
                    
                    . expand nvisits
                    (1,281,108 observations created)
                    
                    . gen date = mdy(1,1,2015) + floor(731 * runiform())
                    
                    .
                    . timer on 1
                    
                    . rangestat (count) date, interval(date -99 0) by(id)
                    
                    . timer off 1
                    
                    .
                    . timer on 2
                    
                    . asrol date, window(date 100) by(id) stat(count)
                    
                    . timer off 2
                    
                    .
                    . timer list
                       1:      7.03 /        1 =       7.0260
                       2:    162.11 /        1 =     162.1080
                    
                    .
                    . assert date_count==count100_date
                    Here's the code in convenient form.

                    Code:
                    clear
                    timer clear
                    set seed 2803
                    set obs 100000
                    gen id = _n
                    gen nvisits = ceil(20 * sqrt(runiform()))
                    expand nvisits
                    gen date = mdy(1,1,2015) + floor(731 * runiform())
                    
                    timer on 1
                    rangestat (count) date, interval(date -99 0) by(id)
                    timer off 1
                    
                    timer on 2
                    asrol date, window(date 100) by(id) stat(count)
                    timer off 2
                    
                    timer list
                    
                    assert date_count==count100_date
                    In short, the speed advantage is reversed here. So, I have to suggest that your statement in #1

                    asrol uses efficient codings in the Mata language which makes this version extremely fast as
                    compared to other available programs.

                    looks to me more suitably phrased as

                    asrol uses efficient codings in the Mata language which makes this version extremely fast as
                    compared with other available programs for some problems, although for quite different problems the
                    opposite is true.
                    Last edited by Nick Cox; 15 Mar 2017, 04:46.



                    • #11
                      Potential users of both programs would benefit from a discussion of the specific situations under which one program outperforms the other. If the two of you invest more time in following this, could we have a summary of the criteria for performance listed here?

                      If the specific data characteristics that favor one or the other command could be diagnosed automatically, it would also be a nice thing to have a wrapper program that chooses to call asrol or rangestat, respectively.

                      Best
                      Daniel



                      • #12
                        Thanks Daniel, that is a nice suggestion. @Nick, your example is interesting; let me look into it.
                        Last edited by Attaullah Shah; 15 Mar 2017, 05:44.


                        • #13
                          Daniel: In practice the wrapper would be much more work than anyone might want to undertake.

                          You might need to examine the data itself, which in large datasets, where these differences can bite, would itself be time-consuming. You don't want to burn up minutes of machine time to decide which program takes 10 s. Or, one could require the user to make statements about their data that they might not be able to answer. You would also need a complete understanding of the assumptions made by the two programs, which at this moment is only attainable by Attaullah as he can see all our code, but not vice versa. There are some other downsides, but those are enough to make the project unappealing to me.

                          No; I think the best way forward is through reproducible examples of realistic problems which stretch either program. This is where other people can be really helpful. (As a very famous R user once said, as I recall: User support. Our idea of user support is that the users should support the programmers by finding the bugs.)

                          I'll permit myself two conjectures. I'm Popperian enough to seek progress by refutations.

                          1. Scope. rangestat can do anything asrol can do, but the converse isn't true.

                          2. Speed for problems where both can be used. rangestat may well be slower for very large (nearly) balanced panels or similar datasets. For irregular spacing in time or other window variable, the converse may well be true.
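
                          Conjecture 2 turns on how densely the window variable fills its range. One rough way to look at that, offered only as a sketch and not as a validated criterion for choosing a program, is the share of id-by-time cells that actually hold an observation. The variable names id and date and the toy data are assumptions for illustration.

                          ```stata
                          clear
                          input id date
                          1   1
                          1   2
                          1   3
                          1   4
                          2   1
                          2  50
                          2 100
                          end

                          * span of the window variable within each group, counted once per group
                          bysort id (date): gen long span = date[_N] - date[1] + 1
                          egen byte tag = tag(id)
                          summarize span if tag, meanonly
                          local cells = r(sum)

                          * share near 1: close to balanced; share near 0: irregular or sparse
                          quietly count
                          local share = r(N) / `cells'
                          display "share of filled cells = " %5.3f `share'
                          ```

                          Duplicate time values within a group push the share above 1, so it should be read alongside a check for duplicates. On large datasets even this pass costs time, which is part of the objection to a wrapper above.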
                          Last edited by Nick Cox; 15 Mar 2017, 06:23.



                          • #14
                            The example used in #1 to claim that asrol is faster than rangestat was nonsensical. In #4, Attaullah concedes the point and proposes another dataset that he claims presents a real life situation where year is not constant. The data include profits per country and industry over a range of years. However, he does not make it clear that there are multiple observations per group of country industry year, up to 710 in one group:

                            Code:
                            . * from https://www.dropbox.com/s/qjhn9fwgx9i5e19/data.dta?dl=0
                            . use "data.dta", clear
                            
                            . rename PROF profit
                            
                            . qui tab country
                            
                            . dis "number of countries = " r(r)
                            number of countries = 69
                            
                            . qui tab industry
                            
                            . dis "number of industries = " r(r)
                            number of industries = 61
                            
                            . qui tab year
                            
                            . dis "number of years = " r(r)
                            number of years = 17
                            
                            . 
                            . * why so many observations? Up to 710 observations in the same
                            . * -country industry year- group
                            . sum
                            
                                Variable |        Obs        Mean    Std. Dev.       Min        Max
                            -------------+---------------------------------------------------------
                                    year |    306,838    2005.141    4.430404       1997       2013
                                 country |          0
                                  profit |    303,259    .0340184    .1669237  -1.999371    4.22261
                                industry |    306,748    39.79098    19.23801         10         89
                            
                            . bysort country industry year: gen N = _N
                            
                            . sum N
                            
                                Variable |        Obs        Mean    Std. Dev.       Min        Max
                            -------------+---------------------------------------------------------
                                       N |    306,838    99.61557    127.1479          1        710
                            Given the structure of the data, I do not understand the logic of calculating a statistic over a rolling window of time in this case.
                            And all observations in the same year will have the same results within the country industry groups.

                            rangestat calculates statistics individually for each observation using other observations that are within a certain range of the current observation. The range is not restricted to a fixed window and each observation defines its own interval. Since rangestat calculates statistics per observation, the example proposed by Attaullah in #4 requires that rangestat redo the same calculation, for the same set of observations, up to 710 times (and on average 99 times per country industry year group).

                            So it's not that rangestat is inefficient here, it's that Attaullah is using it wrong. In the same way that egen could be used to match the results in #1, you could use it again with these new data. Better yet, you can use collapse to reduce the data to one observation per country industry year group and then calculate the mean using rangestat or tsegen. While I'm at it, I'll also take the opportunity to show how to do this efficiently with rangestat by calculating the statistic for only one observation per country industry year group.
                            Code:
                            * from https://www.dropbox.com/s/qjhn9fwgx9i5e19/data.dta?dl=0
                            use "data.dta", clear
                            rename PROF profit
                            
                            timer clear
                            timer on 3
                            collapse (sum) psum = profit (count) nobs = profit, by(country industry year)
                            rangestat (sum) psum nobs, interval(year -4 0) by(country industry)
                            gen double pmean = psum_sum / nobs_sum
                            timer off 3
                            
                            merge 1:m country industry year using "data.dta", assert(match) nogen
                            rename PROF profit
                            
                            timer on 1
                            asrol profit, s(mean) w(year 5) by(country industry)
                            timer off 1
                            
                            timer on 2
                            bysort country industry year: gen skip = _n > 1
                            gen low = cond(skip, 0, year-4)
                            gen high = cond(skip, 0, year)
                            rangestat (mean) profit, interval(year low high) by(country industry)
                            bysort country industry year (skip): replace profit_mean = profit_mean[1]
                            timer off 2
                            
                            assert mean5_profit == profit_mean
                            assert mean5_profit == pmean
                            
                            timer list
                            and the timing results on my computer:
                            Code:
                            . timer list
                               1:      0.76 /        1 =       0.7620
                               2:      0.92 /        1 =       0.9230
                               3:      0.28 /        1 =       0.2850
                            The timing results show that the collapse and rangestat solution is the fastest. The other two are about the same notwithstanding the extra overhead that rangestat has to go through to skip the calculation for duplicate observations.

                            I agree with Nick's summary that rangestat is a more capable and versatile program. rangestat uses an extremely efficient algorithm to locate the set of observations that fall within the specified range for the current observation. The only situation where asrol has a chance to outperform rangestat is when the data are fully rectangular from the get-go, since that is no different from locating a matrix element using a specific row and column.

                            Clearly Attaullah wants to make a contribution with asrol. That's great, but his claims of added value here are not supported by the examples he proposed.



                            • #15
                              At the risk of boring our readers, I would like to add one last post to this thread. While writing asrol, I specifically had finance folks in mind, and I am still confident that asrol is the best option on offer in finance and allied disciplines. The specific example that Nick posted, from patient visits, is a peculiar case with too many gaps in the time series. I tend to agree with Nick that asrol lags in such cases.

                              Robert Picard wrote:
                              "The example used in #1 to claim that asrol is faster than rangestat was nonsensical. In #4, Attaullah concedes the point and proposes another dataset that he claims presents a real life situation where year is not constant."

                              The second example was borrowed from one of my research papers, in which I used data on 40000 firms from 69 countries and 60 industries. The extra bit of code that Robert wrote to make rangestat comparable with asrol in terms of efficiency is understandable. But how many users can do that? To me, the primary objective of a program is to avoid having to write such code in the first place.
                              Regards
                              --------------------------------------------------
                              Attaullah Shah, PhD.
                              Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
                              FinTechProfessor.com
                              https://asdocx.com
                              Check out my asdoc program, which sends outputs to MS Word.
                              For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.
