  • Moving average by several groups - no panel

    Dear all,

    I would like to calculate a moving average of a variable x over a five-year window of age, separately by year and sex. In the end I want a smoothed average of x for each age, in each year, by sex. I do NOT want a moving average over five calendar years (2008, 2009, ...). Please find an example below.

    Some background: My data are provided at the individual level. I want to aggregate them so that I only have values by cells of age, year, and sex, each containing the average of x. Because the data are so noisy with respect to age, I want to apply a moving average over age.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(sex age year x)
    1 40 2008 .2
    1 41 2008 .4
    0 40 2008 .2
    0 40 2008  1
    0 42 2008 .2
    1 40 2009 .7
    1 40 2009 .8
    1 41 2009 .1
    1 42 2009  1
    0 42 2009  1
    end
    I know there are several posts dealing with "moving average by group" applying rolling or rangestat, but I did not manage to apply these to my example.

    Thanks for your help!

    Best,
    Stephanie

  • #2
    I'm not entirely sure what you are asking for, but I think it is this:

    Code:
    rangestat (mean) x, by(year sex) interval(age -2 2)
    Note: the term "five year moving average" is ambiguous in that you do not say whether you want a five year window centered around the index age, or bounded above, or below, by the index age (or the other possible 5 year windows that include a given age). The above code places the index age at the center of the five year window. Modify accordingly.
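
    As a rough sketch of what "modify accordingly" could look like, assuming rangestat is installed from SSC and age is recorded in whole years: the two numbers in interval() are offsets from each observation's own age, so shifting them moves the window. The names x_centered, x_trailing and x_leading are purely illustrative.

    Code:
    * ssc install rangestat          // only needed once
    * centered window: ages a-2 through a+2
    rangestat (mean) x, by(year sex) interval(age -2 2)
    rename x_mean x_centered
    * trailing window: ages a-4 through a
    rangestat (mean) x, by(year sex) interval(age -4 0)
    rename x_mean x_trailing
    * leading window: ages a through a+4
    rangestat (mean) x, by(year sex) interval(age 0 4)
    rename x_mean x_leading
    With individual-level data, each of these averages over every observation that falls in the window, so ages with more observations carry more weight.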


    • #3
      Thank you Clyde,
      this is exactly what I was after!


      • #4
        Code:
        bysort sex year age : egen meanx = mean(x)
        gives you raw means, after which something like

        Code:
        twoway connected meanx age if sex == 0 || connected meanx age if sex == 1, by(year)
        is presumably what you're calling noisy. You want to smooth over 5-year age windows. While I like rangestat (a lot), I would almost never want to use equal weights in smoothing a series. I'd use something like a binomial smoother with weights (1 4 6 4 1) / 16. That is easy to get: age is the time variable; pseudo-panels are just (sex year) groups. I find the syntax of tssmooth awkward, so I gave up on that and did it directly.

        This doesn't produce useful results for your sample, which is too small for it to work, but it's a sketch of the concept.

        Code:
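        * cell means of x; collapse computes the mean by default
        collapse x, by(sex year age)
        * one pseudo-panel per (sex, year) group
        egen pspanel = group(sex year), label
        * age plays the role of the time variable
        tsset pspanel age
        * binomial smoother, weights (1 4 6 4 1)/16; missing if any neighbouring age is absent
        gen x5 = (L2.x + 4 * L1.x + 6 * x + 4 * F1.x + F2.x) / 16
        list, sepby(pspanel)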
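
        Because x5 is missing whenever any of the four neighbouring ages is absent, the youngest and oldest ages in each group (and any age next to a gap) drop out. One possible workaround, offered only as a sketch and not as part of the recipe above, is to use whichever of the five terms exist and rescale by the sum of their weights. It assumes the tsset from the code above is still in force; num, den and x5edge are made-up names.

        Code:
        gen double num = 0
        gen double den = 0
        local weights 1 4 6 4 1
        local ops L2 L1 L0 F1 F2
        forvalues i = 1/5 {
            local w : word `i' of `weights'
            local op : word `i' of `ops'
            * add each term only where that neighbouring age exists
            replace num = num + `w' * `op'.x if !missing(`op'.x)
            replace den = den + `w' if !missing(`op'.x)
        }
        * rescale by the weights actually used
        gen x5edge = num / den if den > 0
        list pspanel age x x5 x5edge, sepby(pspanel)
        Where all five ages are present, x5edge equals x5. Near the ends it rests on fewer values, so those points are smoothed less and should be read with more caution.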


        • #5
          Thanks Nick!
