Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Help to get rolling average (standard deviation) over 5 years

    Hello,

    I have a panel data like this (unbalanced)

    ID Year V1 V2
    0 2005 30 19
    0 2006 20 67
    0 2007 4 54
    0 2008 5 32
    0 2009 17 98
    0 2010 18 6
    0 2011 9 74
    1 2006 20 67
    1 2007 4 54
    1 2008 5 32
    1 2009 17 98
    1 2010 18 6
    1 2011 9 74
    2 2006 20 67
    2 2007 4 54
    2 2008 5 32
    2 2009 17 98
    2 2010 18 6
    2 2011 9 74

    Could any one help me to get the rolling Standard deviation over 5 years for V1, and the rolling mean of V2 over 5 years?

  • #2
    This is the sort of thing you can do with rangestat from SSC. My example is the other way round from yours, but that's just immaterial. I have done separate calculations for each identifier, but don't do that if you want to pool panels.

    For more details read its help or search this forum for mentions of rangestat.


    Code:
         +---------------------------------------------+
         | ID   Year   V1   V2     V1_mean       V2_sd |
         |---------------------------------------------|
      1. |  0   2005   30   19          30           . |
      2. |  0   2006   20   67          25   33.941125 |
      3. |  0   2007    4   54          18   24.826062 |
      4. |  0   2008    5   32       14.75   21.556128 |
      5. |  0   2009   17   98        15.2   30.878795 |
      6. |  0   2010   18    6        12.8   34.853981 |
      7. |  0   2011    9   74        10.6   35.765906 |
         |---------------------------------------------|
      8. |  1   2006   20   67          20           . |
      9. |  1   2007    4   54          12   9.1923882 |
     10. |  1   2008    5   32   9.6666667   17.691806 |
     11. |  1   2009   17   98        11.5   27.584718 |
     12. |  1   2010   18    6        12.8   34.853981 |
     13. |  1   2011    9   74        10.6   35.765906 |
         |---------------------------------------------|
     14. |  2   2006   20   67          20           . |
     15. |  2   2007    4   54          12   9.1923882 |
     16. |  2   2008    5   32   9.6666667   17.691806 |
     17. |  2   2009   17   98        11.5   27.584718 |
     18. |  2   2010   18    6        12.8   34.853981 |
     19. |  2   2011    9   74        10.6   35.765906 |
         +---------------------------------------------+

    Comment


    • #3
      Originally posted by Nick Cox View Post
      This is the sort of thing you can do with rangestat from SSC. My example is the other way round from yours, but that's just immaterial. I have done separate calculations for each identifier, but don't do that if you want to pool panels.

      For more details read its help or search this forum for mentions of rangestat.


      Code:
      +---------------------------------------------+
      | ID Year V1 V2 V1_mean V2_sd |
      |---------------------------------------------|
      1. | 0 2005 30 19 30 . |
      2. | 0 2006 20 67 25 33.941125 |
      3. | 0 2007 4 54 18 24.826062 |
      4. | 0 2008 5 32 14.75 21.556128 |
      5. | 0 2009 17 98 15.2 30.878795 |
      6. | 0 2010 18 6 12.8 34.853981 |
      7. | 0 2011 9 74 10.6 35.765906 |
      |---------------------------------------------|
      8. | 1 2006 20 67 20 . |
      9. | 1 2007 4 54 12 9.1923882 |
      10. | 1 2008 5 32 9.6666667 17.691806 |
      11. | 1 2009 17 98 11.5 27.584718 |
      12. | 1 2010 18 6 12.8 34.853981 |
      13. | 1 2011 9 74 10.6 35.765906 |
      |---------------------------------------------|
      14. | 2 2006 20 67 20 . |
      15. | 2 2007 4 54 12 9.1923882 |
      16. | 2 2008 5 32 9.6666667 17.691806 |
      17. | 2 2009 17 98 11.5 27.584718 |
      18. | 2 2010 18 6 12.8 34.853981 |
      19. | 2 2011 9 74 10.6 35.765906 |
      +---------------------------------------------+
      Could you share the code you use? Another thing is the beginning should be the value? If yes, why you use more than 5 years in calculating

      Comment


      • #4
        Sorry, that missed the boat.

        Code:
        rangestat (mean) V1 (sd) V2, int(Year -4 0) by(ID)
        Where you want to place windows and what you want to do with windows with fewer than 5 years' values are your call.

        Comment


        • #5
          1
          Last edited by Emad Noureldeen; 25 Aug 2021, 11:26.

          Comment


          • #6
            Originally posted by Nick Cox View Post
            This is the sort of thing you can do with rangestat from SSC. My example is the other way round from yours, but that's just immaterial. I have done separate calculations for each identifier, but don't do that if you want to pool panels.

            For more details read its help or search this forum for mentions of rangestat.


            Code:
            +---------------------------------------------+
            | ID Year V1 V2 V1_mean V2_sd |
            |---------------------------------------------|
            1. | 0 2005 30 19 30 . |
            2. | 0 2006 20 67 25 33.941125 |
            3. | 0 2007 4 54 18 24.826062 |
            4. | 0 2008 5 32 14.75 21.556128 |
            5. | 0 2009 17 98 15.2 30.878795 |
            6. | 0 2010 18 6 12.8 34.853981 |
            7. | 0 2011 9 74 10.6 35.765906 |
            |---------------------------------------------|
            8. | 1 2006 20 67 20 . |
            9. | 1 2007 4 54 12 9.1923882 |
            10. | 1 2008 5 32 9.6666667 17.691806 |
            11. | 1 2009 17 98 11.5 27.584718 |
            12. | 1 2010 18 6 12.8 34.853981 |
            13. | 1 2011 9 74 10.6 35.765906 |
            |---------------------------------------------|
            14. | 2 2006 20 67 20 . |
            15. | 2 2007 4 54 12 9.1923882 |
            16. | 2 2008 5 32 9.6666667 17.691806 |
            17. | 2 2009 17 98 11.5 27.584718 |
            18. | 2 2010 18 6 12.8 34.853981 |
            19. | 2 2011 9 74 10.6 35.765906 |
            +---------------------------------------------+
            I highlight the values which I asked about. For example, why the V1_mean is 12.8 in 2010 for ID 0, 10.6 in 2011 for ID 1, and 10.6 in 2011 for ID 1, instead of 6, as we start a new period of 5 years ? The same also for V2_sd, I'm not sure you follow the same why that you used for mean (i.e., without starting new period of 5 years) or not.
            For the window fewer than 5 years I want to exclude them from the sample, meaning that each company should have at least 5 years in total.

            Comment


            • #7
              As the syntax shows I am using windows from -4 to 0 with respect to year, so they extend from year - 4 to year + 0 = year. So for example 12.8 is the mean of 5 values as shown independently by

              Code:
              . mata : mean((20,4,5,17,18)')
                12.8
              Otherwise please do study the help for rangestat. rangestat is maximal in showing results. You can add a twist by (say)


              Code:
              clear 
              input ID Year V1 V2
              0 2005 30 19
              0 2006 20 67
              0 2007 4 54
              0 2008 5 32
              0 2009 17 98
              0 2010 18 6
              0 2011 9 74
              1 2006 20 67
              1 2007 4 54
              1 2008 5 32
              1 2009 17 98
              1 2010 18 6
              1 2011 9 74
              2 2006 20 67
              2 2007 4 54
              2 2008 5 32
              2 2009 17 98
              2 2010 18 6
              2 2011 9 74
              end 
              
              rangestat (count) V1 (mean) V1, int(Year -4 0) by(ID)
              and then focusing on results for which V1_count is 5.

              Comment

              Working...
              X