  • How to sum a sequence and generate a new variable that is the sum of the sequence

    Dear forum users,

    I have a problem. I have panel data and need to delete the time periods that do not have 24 consecutive monthly observations of my variable (returns). Using tsspell, I created a variable that counts the sequence and starts over whenever the sequence is interrupted by a missing value. Now I would like to report, beside each sequence, the total sum of that sequence from its beginning. How could I achieve this? Note that the sequence also restarts within panels, so I need this:
    sequence   new variable (sum of the sequence, computed separately for each sequence)
    1 10
    2 10
    3 10
    4 10
    1 15
    2 15
    3 15
    4 15
    5 15

  • #2
    bysort sequence: egen total = total(x)

    • #3
      Dear Brendan, that does not work. It does not give me the sum for each separate sequence within a panel; it simply gives me the sum over all sequences. My example above is just one panel, which has several sequences inside it.

      • #4
        You're going to have to work harder at making your questions clear, but see if this example helps:
        Code:
        . input id t sequence
        
                    id          t   sequence
          1. 1 1 1
          2. 1 2 2
          3. 1 3 3
          4. 1 4 4
          5. 1 5 1
          6. 1 6 2
          7. 1 7 3
          8. 1 8 4
          9. 1 9 5
         10. 2 1 1
         11. 2 2 2
         12. 2 3 3
         13. 2 4 4
         14. 2 5 5
         15. 2 6 6
         16. 2 7 7
         17. 2 8 1
         18. 2 9 2
         19. 2 10 3
         20. 2 11 4
         21. end
        
        .
        . sort id t
        
        .
        . gen spellno = 1
        
        . by id: replace spellno = spellno[_n-1] + (sequence==1) if _n>1
        (9 real changes made)
        
        .
        . sort id spellno sequence
        
        . by id spellno: egen x = total(spellno)
        
        . list
        
             +-----------------------------------+
             | id    t   sequence   spellno    x |
             |-----------------------------------|
          1. |  1    1          1         1    4 |
          2. |  1    2          2         1    4 |
          3. |  1    3          3         1    4 |
          4. |  1    4          4         1    4 |
          5. |  1    5          1         2   10 |
             |-----------------------------------|
          6. |  1    6          2         2   10 |
          7. |  1    7          3         2   10 |
          8. |  1    8          4         2   10 |
          9. |  1    9          5         2   10 |
         10. |  2    1          1         1    7 |
             |-----------------------------------|
         11. |  2    2          2         1    7 |
         12. |  2    3          3         1    7 |
         13. |  2    4          4         1    7 |
         14. |  2    5          5         1    7 |
         15. |  2    6          6         1    7 |
             |-----------------------------------|
         16. |  2    7          7         1    7 |
         17. |  2    8          1         2    8 |
         18. |  2    9          2         2    8 |
         19. |  2   10          3         2    8 |
         20. |  2   11          4         2    8 |
             +-----------------------------------+
        
        .
        The key is creating the spell-number variable, which increments every time sequence==1.
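The same spell counter can also be built in one line with Stata's running-sum function sum(). The following is only a sketch on made-up toy data (the values are hypothetical, not the poster's), and it assumes, as in the example above, that every spell starts with sequence==1:

```stata
* Hypothetical toy data in the same layout as the example above
clear
input id t sequence
1 1 1
1 2 2
1 3 1
1 4 2
1 5 3
2 1 1
2 2 2
end

* sum() is a running sum, so within each id the counter
* rises by 1 every time a new spell starts (sequence == 1)
bysort id (t): gen spellno = sum(sequence == 1)

list, sepby(id spellno)
```

If observations outside a spell are still in the data (for example with sequence equal to 0), they would be counted into the preceding spell, so they would need to be handled separately.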

        • #5
          Thanks, Brendan, for putting so much effort into this, but maybe I can't see the forest for the trees. Could you please rewrite it so that, beside every sequence, I get the sum of the sequence, or the number of steps (increments) in it, or its highest value? For the first sequence of the first id this is great: you get the highest number of the sequence, 4. But the second sequence's x is 10, not its highest value. If it were, it would be perfect. I am really a layman regarding Stata, I am sorry, but it would be really nice if you could change that. The purpose is that I can then tell Stata to drop if x<24, so that 24-month windows with missing values are removed. I need this to run rolling regressions.

          • #6
            How would you write this in Stata code: keep an observation if the return 12 months earlier (t-12) is not missing and the return 12 months later (t+12) is not missing? Shouldn't that delete every observation that does not have this 24-month window of complete returns?
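One way to express that condition is with the time-series operators L12. and F12., once the data have been declared with tsset. This is a minimal sketch on a hypothetical single panel of 30 months; the variable names id, t and ret are assumptions, and t is assumed to count months consecutively:

```stata
* Hypothetical toy panel: one id, 30 consecutive months, one missing return
clear
set obs 30
gen id = 1
gen t = _n
gen ret = t/100
replace ret = . if t == 5

* Declare the panel structure so that L. and F. operators are available
tsset id t

* Flag observations whose return 12 months back and 12 months ahead both exist
gen byte both_ends = !missing(L12.ret, F12.ret)
keep if both_ends

list id t ret
```

Note that this tests only the two endpoints. In the toy data, t = 13 is kept even though the return at t = 5, inside its window, is missing, so the condition alone does not guarantee 24 complete months; the spell-based approach discussed in this thread checks the whole run.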

            • #7
              The line
              . by id spellno: egen x = total(spellno)

              should be

              . by id spellno: egen x = total(sequence)

              to correspond with your earlier example.

              The code creates a spell-number variable that increments every time sequence is 1 (since this is how your example worked). Treating each spell in which the sequence number rises as a unit, the egen command computes the total of sequence within that spell (e.g., 1+2+3+4 = 10).
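If the aim is to drop spells shorter than 24 months, the number of observations in each spell may be easier to work with than the sum of sequence. The sketch below reuses the spell-number code from this thread on hypothetical toy data; spell_len and the local macro minlen are illustrative names, and the threshold is set to 3 only so that the toy data show an effect (it would be 24 in the real data):

```stata
* Hypothetical toy data in the same layout as the example in this thread
clear
input id t sequence
1 1 1
1 2 2
1 3 1
1 4 2
1 5 3
2 1 1
2 2 2
end

* Spell number: increments every time sequence restarts at 1
sort id t
gen spellno = 1
by id: replace spellno = spellno[_n-1] + (sequence==1) if _n>1

* Length of each spell = number of observations in the id-spellno group
bysort id spellno (t): gen spell_len = _N

* Drop spells that are too short (use 24 for 24-month windows)
local minlen 3
drop if spell_len < `minlen'

list, sepby(id spellno)
```

With consecutive monthly data and a sequence that counts 1, 2, 3, ... within a spell, spell_len equals the highest sequence value of the spell.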

              • #8
                Brendan, thank you very much, this does the trick. However, I just tried to run rollreg (a user-written command from SSC) on the data that still has missing values, and it runs anyway. I don't know whether it identifies the missing values itself and simply skips those regressions, or whether it will give me rubbish as output. Also, how could I run rollreg so that it does not write over the existing .dta file but instead saves the betas in a different file?
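On saving the betas to a separate file: one possibility is to swap rollreg for Stata's official rolling prefix, whose saving() option writes the results to another dataset and leaves the data in memory unchanged. This is a sketch of that alternative, not of rollreg itself, and it uses a simulated panel; the variables ret and mkt and the file name betas are assumptions made for illustration:

```stata
* Hypothetical toy panel: 2 ids, 40 months each, simulated variables
clear
set seed 12345
set obs 80
gen id = ceil(_n/40)
bysort id: gen t = _n
gen mkt = rnormal()
gen ret = 0.5*mkt + rnormal()
xtset id t

* 24-month rolling regressions, run separately for each id.
* Coefficients are written to betas.dta; the data in memory are not replaced.
rolling _b, window(24) saving(betas, replace): regress ret mkt
```

Each id here has 40 - 24 + 1 = 17 windows. Because regress drops observations with missing values, a window that contains missing returns would be estimated on fewer than 24 observations rather than skipped, which is one reason to remove short spells beforehand.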
