Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Average of Previous (x) Observations

    Good morning,

    At the moment I need to calculate the average of the 3 previous nmfp for each horse.

    The command ...

    bysort horse (obs_no) : gen LR3 = (nmfp[_n-1] + nmfp[_n-2] + nmfp[_n-3]) / 3

    ....achieves this but creates a problem in rows 1-3 as there are not 3 previous observations.

    How can I amend the formula to calculate the average nmfp for the previous 2 or 1 or no previous observations, if 3 previous observations do not exist?

    Thank you,
    Hans

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str16 race_time str14 horse long obs_no float(nmfp LR3)
    "30/09/2014 14:00" "Primo Capitano" 207640  .4548          .
    "25/10/2014 13:55" "Primo Capitano" 216342  .2582          .
    "28/11/2014 12:15" "Primo Capitano" 226747 -.5222          .
    "16/03/2015 15:10" "Primo Capitano" 252322 -.5222      .0636
    "04/06/2015 20:05" "Primo Capitano" 280266  .0727 -.26206666
    "03/07/2015 17:30" "Primo Capitano" 290791 -.2582     -.3239
    "02/09/2015 15:40" "Primo Capitano" 312979 -.5164     -.2359
    end

  • #2
    You should cross-reference your previous thread. rangestat from SSC will use whatever is available to calculate a mean.

    Comment


    • #3
      Try this:

      Code:
      . bysort horse (obs_no) : gen myLR3 = (sum( nmfp[_n-1]) - sum( nmfp[_n-4]))/(sum(!missing(nmfp[_n-1])) - s
      > um(!missing(nmfp[_n-4])))
      (1 missing value generated)
      
      . list, sep(0)
      
           +-----------------------------------------------------------------------------+
           |        race_time            horse   obs_no     nmfp         LR3       myLR3 |
           |-----------------------------------------------------------------------------|
        1. | 30/09/2014 14:00   Primo Capitano   207640    .4548           .           . |
        2. | 25/10/2014 13:55   Primo Capitano   216342    .2582           .       .4548 |
        3. | 28/11/2014 12:15   Primo Capitano   226747   -.5222           .       .3565 |
        4. | 16/03/2015 15:10   Primo Capitano   252322   -.5222       .0636       .0636 |
        5. | 04/06/2015 20:05   Primo Capitano   280266    .0727   -.2620667   -.2620667 |
        6. | 03/07/2015 17:30   Primo Capitano   290791   -.2582      -.3239      -.3239 |
        7. | 02/09/2015 15:40   Primo Capitano   312979   -.5164      -.2359      -.2359 |
           +-----------------------------------------------------------------------------+

      Comment


      • #4
        Thank you, Joro. Your solution worked and is very much appreciated.

        Comment


        • #5
          You can also use asrol for this.
          Code:
          ssc install asrol
          
          * Create the range variable, that will be used for identify observations within range
          bys horse (obs_no ) : gen n = _n
          
          * Apply asrol to nmfp variable
          asrol nmfp, stat(mean) window(n -4 -1)
               +----------------------------------------------------------------------------------+
               |        race_time            horse   obs_no     nmfp         LR3   n   mean_4_n~p |
               |----------------------------------------------------------------------------------|
            1. | 30/09/2014 14:00   Primo Capitano   207640    .4548           .   1            . |
            2. | 25/10/2014 13:55   Primo Capitano   216342    .2582           .   2    .45480001 |
            3. | 28/11/2014 12:15   Primo Capitano   226747   -.5222           .   3        .3565 |
            4. | 16/03/2015 15:10   Primo Capitano   252322   -.5222       .0636   4        .0636 |
            5. | 04/06/2015 20:05   Primo Capitano   280266    .0727   -.2620667   5   -.26206666 |
               |----------------------------------------------------------------------------------|
            6. | 03/07/2015 17:30   Primo Capitano   290791   -.2582      -.3239   6   -.32389999 |
            7. | 02/09/2015 15:40   Primo Capitano   312979   -.5164      -.2359   7   -.23589999 |
               +----------------------------------------------------------------------------------+
          More on asrol can be found here https://fintechprofessor.com/2017/10...tics-in-stata/

          And details related to option window can be found here https://fintechprofessor.com/asrol-f...option-window/
          Last edited by Attaullah Shah; 10 Jul 2021, 00:17.
          Regards
          --------------------------------------------------
          Attaullah Shah, PhD.
          Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
          FinTechProfessor.com
          https://asdocx.com
          Check out my asdoc program, which sends outputs to MS Word.
          For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

          Comment


          • #6
            To amplify #2:

            Note that times held as strings are of limited use, and even dangerous, as sorting on them will lead to nonsense in your case. The work-around of Attaullah Shah is fine in creating a sequence number.

            The similarity of asrol's syntax here is no accident, as it now allows a window syntax similar to the rangestat syntax. (You may need to update asrol to exploit this syntax._


            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input str16 race_time str14 horse long obs_no float(nmfp LR3)
            "30/09/2014 14:00" "Primo Capitano" 207640  .4548          .
            "25/10/2014 13:55" "Primo Capitano" 216342  .2582          .
            "28/11/2014 12:15" "Primo Capitano" 226747 -.5222          .
            "16/03/2015 15:10" "Primo Capitano" 252322 -.5222      .0636
            "04/06/2015 20:05" "Primo Capitano" 280266  .0727 -.26206666
            "03/07/2015 17:30" "Primo Capitano" 290791 -.2582     -.3239
            "02/09/2015 15:40" "Primo Capitano" 312979 -.5164     -.2359
            end
            
            gen double time = clock(race_time, "DMY hm")
            format time %tcDD/NN/CCYY_HH:MM
            bysort horse (time) : gen seq = _n
            rangestat nmfp, int(seq -3 -1)
            
            list
            
                 +-------------------------------------------------------------------------------------------------------+
                 |        race_time            horse   obs_no     nmfp         LR3               time   seq    nmfp_mean |
                 |-------------------------------------------------------------------------------------------------------|
              1. | 30/09/2014 14:00   Primo Capitano   207640    .4548           .   30/09/2014 14:00     1            . |
              2. | 25/10/2014 13:55   Primo Capitano   216342    .2582           .   25/10/2014 13:55     2    .45480001 |
              3. | 28/11/2014 12:15   Primo Capitano   226747   -.5222           .   28/11/2014 12:15     3        .3565 |
              4. | 16/03/2015 15:10   Primo Capitano   252322   -.5222       .0636   16/03/2015 15:10     4        .0636 |
              5. | 04/06/2015 20:05   Primo Capitano   280266    .0727   -.2620667   04/06/2015 20:05     5   -.26206666 |
                 |-------------------------------------------------------------------------------------------------------|
              6. | 03/07/2015 17:30   Primo Capitano   290791   -.2582      -.3239   03/07/2015 17:30     6   -.32389999 |
              7. | 02/09/2015 15:40   Primo Capitano   312979   -.5164      -.2359   02/09/2015 15:40     7   -.23589999 |
                 +-------------------------------------------------------------------------------------------------------+

            Comment

            Working...
            X