
  • How do I create a cumulative index?

    Hello,

    I am trying to perform an operation which in Excel or Python would be very easy. However, I have tried different approaches but can't figure out how to do it in Stata.

    This is what my data look like; in reality there are many more groups and observations:
    Group  Counter  Deduct  Desired Auxiliary Variable  Desired Output
    A      1        0       0                           1
    A      2        0       0                           2
    A      3        1       1                           2
    A      4        0       1                           3
    A      5        0       1                           4
    A      6        0       1                           5
    A      7        0       1                           6
    A      8        1       2                           6
    A      9        0       2                           7
    A      10       0       2                           8
    A      11       0       2                           9
    A      12       1       3                           9
    B      1        1       1                           0
    B      2        0       1                           1
    B      3        0       1                           2
    B      4        1       2                           2
    B      5        0       2                           3
    ...    ...      ...     ...                         ...
    The three columns on the left are what I have, the fourth column is a kind of cumulative index which I tried to create in order to reach my desired output as in the fifth column. The fifth column is simply the second minus the fourth column.

    My first thought was to write
    Code:
    gen desired_aux_variable = 0,
    replace desired_aux_variable = desired_aux_variable + 1 if deduct ==1 else desired_aux_variable = desired_aux_variable[_n-1]
    but then I learned about the difference between if commands and if qualifiers and saw that the latter don't allow an else branch. I then looked for a way to process the observations one by one, but I read that such an approach is unusual in Stata.

    Which other approach can I use in this situation?


  • #2
    Hi Pachakutik,

    This post may help you out or give you some ideas: https://www.statalist.org/forums/for...ulative-return



    • #3
      Thanks for your data example, which is helpful. Note the recommended way to give a data example, as below, which is even more helpful. See also https://www.statalist.org/forums/help#stata


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str1 group byte(counter deduct)
      "A"  1 0
      "A"  2 0
      "A"  3 1
      "A"  4 0
      "A"  5 0
      "A"  6 0
      "A"  7 0
      "A"  8 1
      "A"  9 0
      "A" 10 0
      "A" 11 0
      "A" 12 1
      "B"  1 1
      "B"  2 0
      "B"  3 0
      "B"  4 1
      "B"  5 0
      end
      
      . bysort group (counter) : gen wanted = sum(deduct)
      
      . gen desired = counter - wanted
      
      . list, sepby(group)
      
           +---------------------------------------------+
           | group   counter   deduct   wanted   desired |
           |---------------------------------------------|
        1. |     A         1        0        0         1 |
        2. |     A         2        0        0         2 |
        3. |     A         3        1        1         2 |
        4. |     A         4        0        1         3 |
        5. |     A         5        0        1         4 |
        6. |     A         6        0        1         5 |
        7. |     A         7        0        1         6 |
        8. |     A         8        1        2         6 |
        9. |     A         9        0        2         7 |
       10. |     A        10        0        2         8 |
       11. |     A        11        0        2         9 |
       12. |     A        12        1        3         9 |
           |---------------------------------------------|
       13. |     B         1        1        1         0 |
       14. |     B         2        0        1         1 |
       15. |     B         3        0        1         2 |
       16. |     B         4        1        2         2 |
       17. |     B         5        0        2         3 |
           +---------------------------------------------+
      From a Stata point of view, the first key point is that calculations must be done separately by group, which then suggests the use of by:

      Then what you want is a cumulative sum and finally a difference.
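
      For readers coming from a loop-based language, the running total that sum() produces can also be written out by hand, which is close to what the attempt in the original question was aiming for. The following is only a sketch on a subset of the example data (the first four observations of group A and all of group B); the variable name aux is made up for illustration. It relies on replace working through the observations in their current order, so that aux[_n-1] has already been filled in when observation _n is reached.

      ```stata
      clear
      input str1 group byte(counter deduct)
      "A" 1 0
      "A" 2 0
      "A" 3 1
      "A" 4 0
      "B" 1 1
      "B" 2 0
      "B" 3 0
      "B" 4 1
      "B" 5 0
      end

      * Start each group's running total at its first value of deduct
      bysort group (counter): gen aux = deduct if _n == 1

      * Within each group, add deduct to the previous observation's total
      by group: replace aux = aux[_n-1] + deduct if _n > 1

      * Cross-check against the sum() one-liner
      by group: gen wanted = sum(deduct)
      assert aux == wanted
      ```

      The two versions agree here because deduct has no missing values. With missing values they would differ, since sum() treats missing as zero while the explicit version would carry the missing forward. In practice the sum() form is shorter and the usual choice.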

      Sometimes having programmed previously in another language, especially one that people regard as a mainstream programming language, can be a distraction.

      Possible reading:


      https://journals.sagepub.com/doi/pdf...867X0200200106

      https://journals.sagepub.com/doi/pdf...867X1101100308





      • #4
        Thank you for your answers, they were really helpful.

        The sum() function was the command I was missing.

        (Quoting #3:) "Thanks for your data example, which is helpful. Note the recommended way to give a data example, as below, which is even more helpful. See also https://www.statalist.org/forums/help#stata"
        Thanks for the hint, I will use that format in the future.



        • #5
          Good, but see the second paper linked in #3 to learn that in Stata functions and commands are different beasts. This is more than mere terminology, as functions and commands are documented separately and behave differently. Hence sum() is a function and not a command.
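
          A small sketch of the distinction, using the first four values of deduct from the example data; the variable name running is made up for illustration. A function such as sum() only ever appears inside an expression that some command evaluates, whereas a command such as summarize stands on its own and leaves its results behind in r().

          ```stata
          clear
          input byte deduct
          0
          0
          1
          0
          end

          * sum() is a function: it is used inside an expression,
          * here the expression evaluated by the generate command
          generate running = sum(deduct)

          * summarize is a command: it is issued on its own line
          * and stores results such as the total in r(sum)
          summarize deduct
          display r(sum)

          * The two are also looked up differently in the help system:
          * help sum()
          * help summarize
          ```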
