Calculate a variable based on rolling windows in panel

Giuseppe Criaco

Join Date: Aug 2014

Posts: 42
#1

Calculate a variable based on rolling windows in panel

08 Aug 2014, 15:59

Hi,

I have an unbalanced panel and I need to compute variable B in year t as the variance of variable A in t-1, t-2, and t-3. In other words, I need to calculate a variable based on rolling windows of the past three years.

I believe my starting point should be the following:

tsset ID year
by ID: egen B = var(A)

However, I need to ask Stata to take into account the values of variable A only for t-1, t-2, and t-3. Any help would be much appreciated.

Thanks,

Giuseppe
Tags: panel, panel data
Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#2

10 Aug 2014, 06:32

You should explain the provenance of user-written commands. There is no egen var() function in built-in Stata; you appear to be using the function available via the egenmore package, downloadable from SSC. Whatever, I am unsure that I understand what you're wanting to do, but how about doing something exploiting the fact that you have tsset data:

ge L1B = L1.B
ge L2B = L2.B
ge L3B = L3.B

This will create variables containing the lagged variances. If there are no missing data, then the variance for the pooled data (pooling from the 3 lags) can be computed from the variances of each of L1B, L2B, and L3B, can't it? The computation is more complicated with an unbalanced panel because the calculation of the total variance from the component variance needs to take account of the correct number of observations available at each lag. But deft use of the lag operator with generate should allow you to create variables containing those numbers.
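One way to act on the remark about the lag operator is to compute the variance straight from the three lags of A, rather than from lagged variances. The sketch below is an assumption about what is wanted, not the method described above: it borrows the variable names from the original post (ID, year, A), uses made-up data, applies the usual n-1 divisor, and leaves B missing whenever any of the three previous years is absent.

```stata
clear
* made-up unbalanced panel: ID 2 has no observation for 2003
input ID year A
1 2001 23
1 2002 34
1 2003 235
1 2004 4663
1 2005 4562
2 2001 34
2 2002 455
2 2004 2453
2 2005 744
end

tsset ID year

* mean of A over t-1, t-2, t-3 (missing if any lag is missing)
generate double mean3 = (L1.A + L2.A + L3.A) / 3

* sample variance of the three lagged values (divisor n - 1 = 2)
generate double B = ((L1.A - mean3)^2 + (L2.A - mean3)^2 + (L3.A - mean3)^2) / 2

list, sepby(ID)
```

Because the lags respect the tsset time variable, a gap in the years produces a missing lag and therefore a missing B, rather than silently using the wrong year.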
Roberto Ferrer

Join Date: Apr 2014

Posts: 449
#3

10 Aug 2014, 07:01

Have you considered -rolling- ? Something like:

Code:

clear all
set more off

*----- example data -----

input ///
id metric
1 23
1 34
1 235
1 4663
1 4562
1 366
1 485
2 34
2 455
2 235
2 2453
2 744
2 635
2 646
end

bysort id: gen year = cond(id==1, 1980 + _n, 1985 + _n)
order id year
list, sepby(id)

*----- what you want? -----

xtset id year
rolling varian = r(Var), window(3) clear: summarize metric
list, sepby(id)

See -help rolling- for details.

You should:

1. Read the FAQ carefully.

2. "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

3. Describe your dataset. Use list to list data when you are doing so. Use input to type in your own dataset fragment that others can experiment with.

4. Use the advanced editing options to appropriately format quotes, data, code and Stata output. The advanced options can be toggled on/off using the A button in the top right corner of the text editor.
Roberto Ferrer

Join Date: Apr 2014

Posts: 449
#4

10 Aug 2014, 11:01

If you need to keep the original data, then you can -merge- the results of -rolling- back to the original dataset (I did it the way you specified in your original post):

Code:

clear all
set more off

*----- example data -----

input ///
id metric
1 23
1 34
1 235
1 4663
1 4562
1 366
1 485
2 34
2 455
2 235
2 2453
2 744
2 635
2 646
end

bysort id: gen year = cond(id==1, 1980 + _n, 1985 + _n)
order id year
list, sepby(id)

tempfile orig
save "`orig'"

*----- what you want? -----

xtset id year
rolling varian = r(Var), window(3) clear: summarize metric

gen year = end + 1
merge 1:1 id year using "`orig'"

sort id year
order id year
list, sepby(id)

For sure there are more "direct" ways of doing this. This is just one.

Giuseppe Criaco

Join Date: Aug 2014

Posts: 42
#5

11 Aug 2014, 04:39

Thanks Stephen and Roberto!

I actually tried the rolling command and it works just fine if I want to calculate the S.D. However, it creates missing values when I try to calculate the variance.

In the case of the variance it is not actually a big deal since I simply compute Variance = S.D. ^2.

tsset ID year

rolling r(sd), window(3) clear: summarize A

rename _stat_1 B_old

rename end year

generate B= B_old^2

drop B_old

The problem arises when I want to calculate the kurtosis or the skewness. As with the variance, Stata gives me a new variable consisting entirely of missing values. Any idea? This is the code I use:

tsset ID year

rolling r(kurt), window(3) clear: summarize A

rename _stat_1 B

rename end year
Nick Cox

Join Date: Mar 2014

Posts: 35694
#6

11 Aug 2014, 06:03

As the help documents, summarize saves the kurtosis in r(kurtosis). A reference to r(kurt) is not illegal, but that is not defined.

That aside, I can't see very much value in kurtosis calculated from subsamples of 3. When the value is not indeterminate, it appears to be always 1.5. It would be an amusing derivation to show why that is so.
(EDIT: Results cited in http://www.stata-journal.com/sjpdf.h...iclenum=st0204 give an upper limit of 1.5, but that is only part of the question.)
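A quick check of the 1.5 claim, as a sketch with three arbitrary made-up values. summarize with the detail option reports kurtosis as m4/m2^2 using moments about the mean with divisor n. With three observations the deviations d1, d2, d3 sum to zero, which forces sum(d^4) = (sum(d^2))^2 / 2, so the ratio is (1/6)/(1/9) = 1.5 whenever the three values are not all equal.

```stata
clear
* three arbitrary, distinct values (made up for illustration)
input x
2
7
19
end

summarize x, detail
display %9.4f r(kurtosis)
```

Replacing the three values with any others that are not all identical should leave the displayed kurtosis unchanged.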

See also Section 18 of the FAQ Advice.

(You don't give any code showing how you tried to calculate the variance, so diagnosis is difficult, given my limited powers of http://en.wikipedia.org/wiki/Extrasensory_perception. But I guess at a similar error.)

Last edited by Nick Cox; 11 Aug 2014, 06:30.
Roberto Ferrer

Join Date: Apr 2014

Posts: 449
#7

11 Aug 2014, 06:24

Originally posted by Nick Cox:

(You don't give any code showing how you tried to calculate the variance, so diagnosis is difficult, given my limited powers of http://en.wikipedia.org/wiki/Extrasensory_perception But I guess at a similar error.)

Probably Giuseppe is using -r(var)- and not -r(Var)-.
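The names of stored results are case sensitive, and a quick way to see exactly what summarize leaves behind is return list. The sketch below uses the auto dataset shipped with Stata purely as stand-in data; the point is only that r(Var) is defined while r(var) evaluates to missing without any error message.

```stata
sysuse auto, clear

summarize price
return list

* defined: the variance stored by summarize
display r(Var)

* not defined: evaluates to missing (.), with no error
display r(var)
```

This is why -rolling- with r(var) runs without complaint yet fills the new variable with missing values.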

Giuseppe Criaco

Join Date: Aug 2014

Posts: 42
#8

11 Aug 2014, 11:38

Thanks so much Nick and Roberto!

Roberto, I was indeed using -r(var)-. With r(Var) it works just fine. Thanks!

I will now try to use r(kurtosis). Thanks for your hint and readings Nick!

Thanks again!

Giuseppe

Last edited by Giuseppe Criaco; 11 Aug 2014, 11:40.
Nick Cox

Join Date: Mar 2014

Posts: 35694
#9

11 Aug 2014, 11:43

#6 implies that kurtosis from samples of 3 tells you almost nothing about the data. The only exception is that 3 identical values will give you indeterminate kurtosis.
Giuseppe Criaco

Join Date: Aug 2014

Posts: 42
#10

12 Aug 2014, 01:26

Thanks Nick,

I have tried to run the command for 5 years with r(kurtosis). However, it does not work (new variable with all missing values). This is the code I have used:

rolling r(kurtosis), window(5) clear: summarize A

Could that be because the help for summarize says this about kurtosis?

r(kurtosis) kurtosis (detail only)

Thanks again!

Giuseppe
Nick Cox

Join Date: Mar 2014

Posts: 35694
#11

12 Aug 2014, 01:43

You got it. You must specify the detail option.
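For completeness, a sketch of the corrected call. The data are made up, and the variable names (ID, year, A) follow the earlier posts; the only change to the command in #10 is the detail option after summarize, plus a name for the new variable.

```stata
clear
* made-up balanced panel, 7 years per ID
input ID year A
1 2001 23
1 2002 34
1 2003 235
1 2004 4663
1 2005 4562
1 2006 366
1 2007 485
2 2001 34
2 2002 455
2 2003 235
2 2004 2453
2 2005 744
2 2006 635
2 2007 646
end

tsset ID year

* r(kurtosis) is only stored when summarize is run with detail
rolling kurt = r(kurtosis), window(5) clear: summarize A, detail

rename end year
list, sepby(ID)
```

The same applies to r(skewness), which summarize also stores only with detail.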