Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating the average of previous observations in a panel

    Dear forum users,

    I have a question that I cannot seem to figure out by myself and I can't find an answer on the internet.

    I want to look at the average of wind speeds over the five previous and how it affects wind turbine construction.
    My data currently looks like this:

    dist_code year wind_speed
    3154 1995 3.7
    3154 1996 3.728947
    3154 1997 4.152822
    3154 1998 4.138276
    3154 1999 3.912632
    3154 2000 3.830851
    3154 2001 3.751
    3154 2002 4.137689
    3155 1995 2.952615
    3155 1996 2.832849
    3155 1997 3.146398
    3155 1998 3.342508
    3155 1999 3.167976
    3155 2000 3.104127
    3155 2001 2.838462
    3155 2002 2.686159



    However, I would like to have a fourth column, showing the average wind speed over the last 5 years, so something like:
    dist_code year wind_speed wind_speed_prev_5yr_avg
    3154 1995 3.7
    3154 1996 3.728947
    3154 1997 4.152822
    3154 1998 4.138276
    3154 1999 3.912632
    3154 2000 3.830851 3.884
    3154 2001 3.751 3.91
    3154 2002 4.137689 3.82
    3155 1995 2.952615
    3155 1996 2.832849
    3155 1997 3.146398
    3155 1998 3.342508
    3155 1999 3.167976
    3155 2000 3.104127 3.084
    3155 2001 2.838462 3.015
    3155 2002 2.686159 2.97


    Is there an easy way to do this?

    Thank you very much in advance.
    Last edited by Victor Smid; 29 Mar 2019, 09:43.

  • #2
    Yes; this is easy. With data like yours, you could just use sum() and calculate the difference between the cumulative sum and itself five years earlier (and then divide by 5).

    But you would need to keep track of missing values and gaps if data were not so well behaved as your example.

    But it can be easier. Here I use rangestat (SSC), which is smart about complications like that -- and that's not the only other way to do it.

    Naturally you are at liberty to discard results from incomplete windows, which I don't ever want to do before seeing them.

    Please note here that (in effect) I used dataex, as you are asked to do (FAQ Advice #12).

    Code:
    clear
    input dist_code year wind_speed
    3154 1995 3.7
    3154 1996 3.728947
    3154 1997 4.152822
    3154 1998 4.138276
    3154 1999 3.912632
    3154 2000 3.830851
    3154 2001 3.751
    3154 2002 4.137689
    3155 1995 2.952615
    3155 1996 2.832849
    3155 1997 3.146398
    3155 1998 3.342508
    3155 1999 3.167976
    3155 2000 3.104127
    3155 2001 2.838462
    3155 2002 2.686159
    end
    
    rangestat (count) wind_speed (mean) wind_speed, int(year -5 -1) by(dist)
    
    list, sepby(dist)
    
        +---------------------------------------------------+
         | dist_c~e   year   wind_s~d   wind_s~t   wind_sp~n |
         |---------------------------------------------------|
      1. |     3154   1995        3.7          .           . |
      2. |     3154   1996   3.728947          1         3.7 |
      3. |     3154   1997   4.152822          2   3.7144735 |
      4. |     3154   1998   4.138276          3   3.8605897 |
      5. |     3154   1999   3.912632          4   3.9300113 |
      6. |     3154   2000   3.830851          5   3.9265354 |
      7. |     3154   2001      3.751          5   3.9527056 |
      8. |     3154   2002   4.137689          5   3.9571162 |
         |---------------------------------------------------|
      9. |     3155   1995   2.952615          .           . |
     10. |     3155   1996   2.832849          1    2.952615 |
     11. |     3155   1997   3.146398          2    2.892732 |
     12. |     3155   1998   3.342508          3   2.9772874 |
     13. |     3155   1999   3.167976          4   3.0685925 |
     14. |     3155   2000   3.104127          5   3.0884692 |
     15. |     3155   2001   2.838462          5   3.1187716 |
     16. |     3155   2002   2.686159          5   3.1198942 |
         +---------------------------------------------------+
    
    . describe
    
    Contains data
      obs:            16                          
     vars:             5                          
     size:           448                          
    --------------------------------------------------------------------------------------------------------------------------------------------------
                  storage   display    value
    variable name   type    format     label      variable label
    --------------------------------------------------------------------------------------------------------------------------------------------------
    dist_code       float   %9.0g                
    year            float   %9.0g                
    wind_speed      float   %9.0g                
    wind_speed_co~t double  %10.0g                count of wind_speed
    wind_speed_mean double  %10.0g                mean of wind_speed
    --------------------------------------------------------------------------------------------------------------------------------------------------
    Note: Small discrepancies between your results and mine.

    rangestat (SSC) is one keyword for searching the forum for similar questions.
    Last edited by Nick Cox; 29 Mar 2019, 09:57.

    Comment

    Working...
    X