Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Sum of several rows

    Hi all,

    The original data set has the first two columns (t, var).
    t var sum1 sum2
    1 3 12 12
    2 4 12 10
    3 5 12 8
    4 1 11 11
    5 2 11 12
    6 8 11 19
    7 2 12 12
    8 9 12
    9 1 12
    In the third column, I want to get the sum of var every 3 rows, so for t=1-3, sum1=3+4+5=12, for t=4-6, sum1=1+2+8=11. The next column calculates the overlapping sum of var every 3 rows, so for t=1, sum2=3+4+5=12, for t=2, sum2=4+5+1=10.

    I choose to sum every 3 rows just for display, it can be quite large like 30. Thus, codes like "gen sum2=var+var[_n+1]+var[_n+2]" are not the best solution here.

    How do I generate sum1 and sum2?
    Thanks a lot!

  • #2
    The best tool for this is the community contributed rangestat command available from SSC. You can install it (you need only do this once) by running
    Code:
    ssc install rangestat
    help rangestat
    and reading the output of the help command to get an idea of what it can do for you. Here's how to apply it to your example data.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(t var sum1 sum2)
    1 3 12 12
    2 4 12 10
    3 5 12  8
    4 1 11 11
    5 2 11 12
    6 8 11 19
    7 2 12 12
    8 9 12  .
    9 1 12  .
    end
    
    generate low  = _n-mod(_n-1,3)
    generate high = _n+2-mod(_n-1,3)
    rangestat (sum) want1=var, interval(t, low, high )
    rangestat (sum) want2=var, interval(t 0 2)
    replace want2 = . in -2/l
    drop low high
    list, clean
    Code:
    . list, clean 
    
           t   var   sum1   sum2   want1   want2  
      1.   1     3     12     12      12      12  
      2.   2     4     12     10      12      10  
      3.   3     5     12      8      12       8  
      4.   4     1     11     11      11      11  
      5.   5     2     11     12      11      12  
      6.   6     8     11     19      11      19  
      7.   7     2     12     12      12      12  
      8.   8     9     12      .      12       .  
      9.   9     1     12      .      12       .

    Comment


    • #3
      William Lisowski gives excellent advice. Here's a variant on his solution.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte(t var sum1 sum2)
      1 3 12 12
      2 4 12 10
      3 5 12  8
      4 1 11 11
      5 2 11 12
      6 8 11 19
      7 2 12 12
      8 9 12  .
      9 1 12  .
      end
      
      gen T = ceil(t/3)
      rangestat (sum) wanted1=var, int(T 0 0)
      rangestat (count) count2=var (sum) wanted2=var, int(t 0 2)
      replace wanted2 = . if count2 < 3
      
      list, sepby(T)
      
           +--------------------------------------------------------+
           | t   var   sum1   sum2   T   wanted1   count2   wanted2 |
           |--------------------------------------------------------|
        1. | 1     3     12     12   1        12        3        12 |
        2. | 2     4     12     10   1        12        3        10 |
        3. | 3     5     12      8   1        12        3         8 |
           |--------------------------------------------------------|
        4. | 4     1     11     11   2        11        3        11 |
        5. | 5     2     11     12   2        11        3        12 |
        6. | 6     8     11     19   2        11        3        19 |
           |--------------------------------------------------------|
        7. | 7     2     12     12   3        12        3        12 |
        8. | 8     9     12      .   3        12        2         . |
        9. | 9     1     12      .   3        12        1         . |
           +--------------------------------------------------------+

      Comment


      • #4
        Alternatively, and without the use of external commands, but with Stata's built-in only:
        Code:
        tsset t
        tssmooth ma sm1=var, window(0 1 2)
        replace sm1=sm1*3
        replace sm1=. in -2/L
        Note also, that if you are doing this to calculate the moving average, than tssmooth ma is your best friend.

        PS: easily generalizable for any window, just make sure the numbers highlighted in BLUE, are 1 smaller than the number highlighted in GREEN.

        Sincerely, Sergiy

        Comment

        Working...
        X