Moving average with panel data

Anoush Khachatryan

Join Date: Sep 2021

Posts: 56
#1

03 Jun 2022, 16:41

Hello all,

Below is an example of the data I am working with:

Code:

input float(year quarter miles)
2000 1 5
2000 1 2
2000 1 7
2000 2 3
2000 2 6
2000 3 8
2000 3 9
2000 3 2
2000 3 1
2000 3 7
2000 4 8
2000 4 9
2000 4 4
2000 4 5
2000 4 3
2000 4 4
2001 1 2
2001 1 2
2001 2 3
2001 2 4
2001 3 5
2001 3 6
2001 3 4
2001 4 6
2001 4 3
2001 4 2
2001 4 5
2002 1 3
2002 1 7
2002 1 7
2002 1 4
2002 2 4
2002 2 6
2002 3 7
2002 3 5
2002 4 5
2002 4 3
2002 4 2
end

I would like to generate a new variable called "moving_average" that creates a moving average for the miles variable by year and quarter. For instance, 2000Q1 should have its own moving average, 2000Q2 should have its own moving average, etc. I would appreciate any assistance with this!

Thanks,
Anoush
Clyde Schechter

Join Date: Apr 2014

Posts: 30356
#2

03 Jun 2022, 16:58

I don't understand what you want. A moving average has a window specifying how many lagged and leading observations are included around each observation (and whether the current observation is included). You say nothing about that. Perhaps you mean you want a running average in each year? That would start with the first observation of a year being its own running average. Then for the second observation we take the average of the first two observations. Then for the third observation the average of the first three, etc. until we reach the end of the year. If that's what you meant:

Code:

sort year quarter, stable
by year: gen running_average = sum(miles)/_n
Anoush Khachatryan

Join Date: Sep 2021

Posts: 56
#3

03 Jun 2022, 17:16

Clyde Schechter thank you for your response. I apologize, I must have misunderstood the meaning of moving average.

I suppose for the moving average I would like to use two lagged observations, the current observation, and two leading observations. Is there a way to calculate this? I would appreciate any help!

Anoush

Clyde Schechter

Join Date: Apr 2014
Posts: 30356
#4

03 Jun 2022, 18:29

Yes. It's just a little more complicated:

Code:

gen int qdate = yq(year, quarter)
format qdate %tq

sort qdate, stable
gen long seq = _n
tsset seq

levelsof year, local(years)
gen moving_average = .
foreach y of local years {
    tssmooth ma temp = miles if year == `y', window(2 1 2)
    replace moving_average = temp if year == `y'
    drop temp
}
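
As a rough check on what window(2 1 2) computes, the sketch below compares tssmooth ma with a hand-coded five-term mean built from lag and lead operators. The toy series x and the variable names x_ma and x_by_hand are made up for illustration and are not part of the code above. The comparison is restricted to interior observations, because the hand-coded version is missing wherever a lag or lead is unavailable.

```stata
clear
set obs 10
gen t = _n
tsset t

* toy series, purely illustrative
gen x = t^2

* equal-weight smoother: two lags, the current value, two leads
tssmooth ma x_ma = x, window(2 1 2)

* the same five-term mean written out with time-series operators;
* missing for the first two and last two observations
gen x_by_hand = (L2.x + L1.x + x + F1.x + F2.x)/5

list t x x_ma x_by_hand
```

Where all five values exist, the two columns should agree to within float precision.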


Anoush Khachatryan

Join Date: Sep 2021

Posts: 56
#5

03 Jun 2022, 19:16

Thank you so much, Clyde Schechter. You are always so helpful!

Anoush

Nick Cox

Join Date: Mar 2014
Posts: 36058
#6

04 Jun 2022, 02:06

There are two important wrinkles here.

First, there are unequal numbers of observations in the example at each date, which suggests that a (better) alternative is to take means before smoothing. That can be done with a collapse, then a smooth, then a merge.
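
A minimal sketch of that collapse, smooth, merge sequence, here with equal weights and the two-lag, two-lead window asked for in #3. It runs on a shortened subset of the data in #1, and the names obs_level, mean_miles and ma_miles are chosen for illustration only.

```stata
clear
input float(year quarter miles)
2000 1 5
2000 1 2
2000 1 7
2000 2 3
2000 2 6
2000 3 8
2000 4 4
2001 1 2
2001 2 3
end

* keep a copy of the observation-level data to merge back into
tempfile obs_level
save `obs_level'

* one mean per quarter, so each date enters the smoother once
collapse (mean) mean_miles = miles, by(year quarter)
gen qdate = yq(year, quarter)
format qdate %tq
tsset qdate

* equal weights over two lags, the current quarter, and two leads
tssmooth ma ma_miles = mean_miles, window(2 1 2)

* attach the quarterly results to every original observation
merge 1:m year quarter using `obs_level', assert(match) nogenerate
sort year quarter, stable
list, sepby(year quarter)
```

Every observation within a quarter then carries the same smoothed value, which is the point of taking means first.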

Second, equal weights as in

(1/5) * value at t - 2 + (1/5) * value at t - 1 + (1/5) * value at t + (1/5) * value at t + 1 + (1/5) * value at t + 2

have no virtue beyond extreme simplicity.

There are many alternatives, of which the method John W. Tukey called Hanning, (1/4) previous + (1/2) this + (1/4) next, is simple enough and can be extended to any odd number of weights by convolution, so yielding weights in the proportions

1 2 1
1 4 6 4 1
1 6 15 20 15 6 1

and so on. These are just binomial coefficients and these smoothers are often called binomial filters or smoothers. Just about any time series book has a chapter on smoothing before proceeding to allowed principles of witchcraft such as ARIMA modelling. The story starts with

1. If the aim of smoothing is to summarize values around here then values near here carry more information than those further away.

2. A smoother needs to be thought of in terms of what happens in the frequency domain and Hanning and its binomial siblings are good enough for most fairly simple purposes.

Here is that done for the data in #1 with a bonus: I show how a spike is smoothed, thus exposing the weights to view.

Code:

clear

input float(year quarter miles)
2000 1 5
2000 1 2
2000 1 7
2000 2 3
2000 2 6
2000 3 8
2000 3 9
2000 3 2
2000 3 1
2000 3 7
2000 4 8
2000 4 9
2000 4 4
2000 4 5
2000 4 3
2000 4 4
2001 1 2
2001 1 2
2001 2 3
2001 2 4
2001 3 5
2001 3 6
2001 3 4
2001 4 6
2001 4 3
2001 4 2
2001 4 5
2002 1 3
2002 1 7
2002 1 7
2002 1 4
2002 2 4
2002 2 6
2002 3 7
2002 3 5
2002 4 5
2002 4 3
2002 4 2
end

tab year quarter

save anoush, replace

collapse miles, by(year quarter)  
gen qdate = yq(year, quarter)
tsset qdate

gen spike = _n == floor(_N/2)
tssmooth nl miles_s=miles, smoother(HH)
tssmooth nl spike_s=spike, smoother(HH)

list, sepby(year)

merge 1:m year quarter using anoush

Code:

     +----------------------------------------------------------------+
     | year   quarter      miles   qdate   spike    miles_s   spike_s |
     |----------------------------------------------------------------|
  1. | 2000         1   4.666667     160       0   4.666667         0 |
  2. | 2000         2        4.5     161       0       4.85         0 |
  3. | 2000         3        5.4     162       0   4.941667         0 |
  4. | 2000         4        5.5     163       0     4.4125     .0625 |
     |----------------------------------------------------------------|
  5. | 2001         1          2     164       0       3.65       .25 |
  6. | 2001         2        3.5     165       1    3.65625      .375 |
  7. | 2001         3          5     166       0   4.203125       .25 |
  8. | 2001         4          4     167       0    4.59375     .0625 |
     |----------------------------------------------------------------|
  9. | 2002         1       5.25     168       0    4.90625         0 |
 10. | 2002         2          5     169       0   5.145833         0 |
 11. | 2002         3          6     170       0   4.703125         0 |
 12. | 2002         4   3.333333     171       0   3.333333         0 |
     +----------------------------------------------------------------+
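
The spike_s column shows the weights 1/16, 4/16, 6/16, 4/16, 1/16, that is, the row 1 4 6 4 1 of Pascal's triangle divided by 2^4. As a small sketch, such rows can be generated with Stata's comb() function. The loop and the macro names k, j and w are illustrative only and are not taken from tssmooth or bsmplot.

```stata
* Binomial smoother weights: row k of Pascal's triangle, each divided by 2^k.
* Even k gives an odd number of weights, centred on the current value.
forvalues k = 2(2)6 {
    local w
    forvalues j = 0/`k' {
        local w `w' `=comb(`k', `j')'
    }
    display "k = `k': `w'  (each divided by `=2^`k'')"
}
```

The k = 4 row corresponds to applying Hanning twice, which is what smoother(HH) requests.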

More discussion at

Code:

. search bsmplot, sj historical

Search of official help files, FAQs, Examples, and Stata Journals

SJ-4-4 gr22_1 . . . . . . . . . . . . . . . . . Software update for bsmplot
(help bsmplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
Q4/04 SJ 4(4):490
binomial smoothing plot program rewritten so that it now
produces Stata 8 graphs

STB-35 gr22 . . . . . . . . . . . . . . . . . . . . Binomial smoothing plot
(help bsmplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
1/97 pp.7--9; STB Reprints Vol 6, pp.36--38
produce a plot of both yvar and the result of smoothing yvar
by a binomial filter against xvar

Last edited by Nick Cox; 04 Jun 2022, 02:09.


Anoush Khachatryan

Join Date: Sep 2021

Posts: 56
#7

06 Jun 2022, 12:51

Nick Cox thank you so much! Very helpful!