  • average for specific range of years

    Hi everybody,

    I have two variables: one is year (a list of years from 1900 to 2000), the other is earnings (a numerical value for each year).
    Is there any way to generate a third variable with the average of earnings which only includes values corresponding to years after the respective one?

    So that, for example, the value of the third variable for the year 1995 is the average of earnings from 1995 to 2000.

    Thanks!

  • #2
    If I understand this correctly, what you want is fairly easy with a loop if you just reverse time first, but easier still with rangestat from SSC (install it with ssc install rangestat). Here the averages shown are for the years after each year (your first wording), not for that year and the years after it (your second wording).

    Code:
    . webuse grunfeld, clear
    
    . rangestat (count) invest (mean) invest, int(year 1 .) by(company)
    
    . 
    . l invest*  year if company == 1
    
         +--------------------------------------+
         | invest   invest~t   invest_~n   year |
         |--------------------------------------|
      1. |  317.6         19   623.30527   1935 |
      2. |  391.8         18   636.16667   1936 |
      3. |  410.6         17    649.4353   1937 |
      4. |  257.7         16   673.91875   1938 |
      5. |  330.8         15   696.79334   1939 |
         |--------------------------------------|
      6. |  461.2         14   713.62143   1940 |
      7. |    512         13   729.13077   1941 |
      8. |    448         12   752.55834   1942 |
      9. |  499.6         11   775.55455   1943 |
     10. |  547.5         10      798.36   1944 |
         |--------------------------------------|
     11. |  561.2          9   824.71111   1945 |
     12. |  688.1          8   841.78751   1946 |
     13. |  568.9          7   880.77143   1947 |
     14. |  529.2          6   939.36667   1948 |
     15. |  555.1          5     1016.22   1949 |
         |--------------------------------------|
     16. |  642.9          4     1109.55   1950 |
     17. |  755.9          3   1227.4333   1951 |
     18. |  891.2          2     1395.55   1952 |
     19. | 1304.4          1      1486.7   1953 |
     20. | 1486.7          .           .   1954 |
         +--------------------------------------+
    The count variable is just to underline how many values are included.

    You don't mention a panel structure but my guess is that you have one. If you don't have one, then you don't need a by() option.

    To include the year in question, change the option to int(year 0 .), where int() abbreviates interval().
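
    As a rough sketch for the variables in the original question, assuming they are named year and earnings, that there is one observation per year and no panel identifier (so no by() option), and that rangestat's default name earnings_mean is acceptable:

    Code:
    * assumes rangestat is installed: ssc install rangestat
    * mean of earnings over this year and all later years
    rangestat (mean) earnings, interval(year 0 .)

    * check the 1995 value against a direct calculation
    summarize earnings if inrange(year, 1995, 2000)
    list year earnings earnings_mean if year >= 1995

    With data ending in 2000, the value of earnings_mean for 1995 should match the mean that summarize reports.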


    • #3
      Thank you for the fast answer! This is what I was searching for.


      • #4
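        Reversing time, as mentioned in #2, needs nothing beyond official Stata commands. Note that this running mean includes the current year.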

        Code:
        . gen negyear = -year
        
        . bysort company (negyear) : gen double mean = sum(invest) / sum(invest < .)
        
        . sort company year
        
        . l invest mean year if company == 1
        If there are no missing values, we could use _n as the denominator in place of sum(invest < .).
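
        A minimal sketch of that variant, continuing from the code above and assuming invest is never missing and there is one observation per company and year (the variable name mean_n is made up for illustration):

        Code:
        * within each company, _n counts observations from the latest year back to this one
        bysort company (negyear) : gen double mean_n = sum(invest) / _n
        sort company year

        * the two versions should agree whenever invest has no missing values
        assert mean == mean_n

        If invest did have missing values, sum() would treat them as zero in the numerator while _n would still count them, so the sum(invest < .) denominator is the safer default.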
