Calculating variance in Stata

Cen Sophia

Join Date: Jul 2023

Posts: 9
#1

28 Jul 2023, 05:21

Hi,

I have followed Clyde's code here to calculate the variance, and eventually the standard deviation, in my data:

Code:
// CALCULATE SOME RUNNING TOTALS OF RET AND RET^2*

by company_id (bcal_date), sort: gen sum_ret = sum(return)
by company_id (bcal_date): gen sum_ret_sq = sum(return^2)

// AND A RUNNING COUNT OF NON MISSING OBSERVATIONS*
by company_id (bcal_date): gen int n_obs = sum(!missing(return))

sort company_id bcal_date
// NOW CALCULATE RUNNING STANDARD DEVIATIONS
gen variance = (L1.sum_ret_sq - L3047.sum_ret_sq)/(L1.n_obs-L3047.n_obs)-((L1.sum_ret - L3047.sum_ret)/(L1.n_obs-L3047.n_obs))^2
gen sd = sqrt(variance)

However, I cannot seem to get the L1 numbers correct. I am unsure which number to place here. I used the largest value of n_obs, which is 3047 for me, but it does not work. Does anyone have any advice? It would be much appreciated!
Nick Cox

Join Date: Mar 2014

Posts: 35709
#2

28 Jul 2023, 05:46

To get a panelwise SD, I would use egen, sd(). As you are really interested in the SD, there is no need to square it to get the variance.
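A minimal sketch of the panelwise approach, using the variable names from #1 (company_id, bcal_date, return). The toy data and the new variable name sd_ret are made up for illustration only.

```stata
// Hypothetical toy panel standing in for the real data
clear
input company_id bcal_date return
1 1  0.01
1 2 -0.02
1 3  0.03
2 1  0.00
2 2  0.02
2 3  0.04
end

// One SD per company, repeated on every row of that company.
// egen's sd() uses the usual n - 1 divisor.
bysort company_id: egen sd_ret = sd(return)

list, sepby(company_id)
```

This gives a single number per panel, so it is not a running SD.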

To get running SD, I would tend to reach for rangestat from SSC or rolling.
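A sketch of what a rangestat call might look like, assuming bcal_date counts business days in steps of 1 within each company. The window of 5 days, the simulated data and the names sd_ret and n_ret are arbitrary choices for illustration, not anything from the original data.

```stata
// ssc install rangestat    // one-time install from SSC, if not yet installed

// Hypothetical toy panel: 2 companies, 10 business days each
clear
set obs 20
gen company_id = 1 + (_n > 10)
bysort company_id: gen bcal_date = _n
set seed 12345
gen return = rnormal(0, 0.01)

// SD of return over the previous 5 days (t-5 to t-1), excluding the current day,
// plus the number of nonmissing returns that went into each window
rangestat (sd) sd_ret=return (count) n_ret=return, interval(bcal_date -5 -1) by(company_id)

list company_id bcal_date return sd_ret n_ret, sepby(company_id)
```

Keeping the count alongside the SD makes it easy to discard results from incomplete windows at the start of each panel.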

If 3047 is your panel length, then L3047. of any variable is missing in every observation. Consider that if 2 were your panel length, then in observation 2 L1 yields the value for observation 1, and L2 yields missing.
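To make the lag arithmetic concrete, here is a small sketch on made-up data with a panel length of 6. It assumes the data have been xtset, which the L. operators require. The point is that the second lag should be the window length plus 1, not the panel length. The names win_sum and too_far and the window length of 2 are hypothetical.

```stata
// Hypothetical single panel with 6 periods
clear
set obs 6
gen company_id = 1
gen bcal_date = _n
gen return = _n / 100
xtset company_id bcal_date

gen sum_ret = sum(return)

// Window of k = 2 previous days: subtract the running total from k + 1 = 3 periods back
gen win_sum = L1.sum_ret - L3.sum_ret

// A lag as long as the panel reaches before the first observation, so it is always missing
gen too_far = L6.sum_ret

list bcal_date return sum_ret win_sum too_far
```

In period 4, win_sum is the sum of the returns in periods 2 and 3, while too_far is missing on every row.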
Cen Sophia

Join Date: Jul 2023

Posts: 9
#3

28 Jul 2023, 06:00

Hi Nick,

Thanks so much for your response.

My plan after finding the standard deviation is to estimate a GARCH model. Would it be okay to just have the standard deviation to do this? I apologise if this comes across as a silly question, I am quite new to Stata.

Also, would you have any suggestions for the code needed for the rolling/rangestat command?

Thank you!
Cen Sophia

Join Date: Jul 2023

Posts: 9
#4

28 Jul 2023, 06:30

Hi Nick,

I wanted to ask if you could explain further what you meant about 3047 being missing. Would the code need to be:

Code:
gen variance = (L1.sum_ret_sq-L2.sum_ret_sq)/(L1.n_obs-L2.n_obs)-((L1.sum_ret - L2.sum_ret)/(L1.n_obs-L2.n_obs))^2

I tried the rangestat command for sd, and it gave me slightly different numbers to the few that the above code produced, so potentially it is better to stick with this code?
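One possible source of a small discrepancy, offered as an observation rather than a confirmed diagnosis: the running-sum formula (mean of squares minus squared mean) divides by n, whereas Stata's sd statistics divide by n - 1. Note also that L1 minus L2 leaves a window of a single observation, for which that formula returns 0 or missing. The sketch below uses made-up numbers to show the two divisors side by side.

```stata
// Hypothetical data: three values
clear
input x
1
2
4
end

quietly summarize x
scalar n = r(N)
scalar mean_x = r(mean)
scalar sd_sample = r(sd)                 // divisor n - 1

gen x_sq = x^2
quietly summarize x_sq
scalar var_pop = r(mean) - mean_x^2      // divisor n, as in the running-sum formula

display "SD from running-sum style formula: " sqrt(var_pop)
display "SD as reported by summarize:       " sd_sample
display "Formula rescaled by n/(n - 1):     " sqrt(var_pop * n / (n - 1))
```

The last two displayed values agree, so the two approaches differ only by the factor sqrt(n/(n - 1)) when they cover the same window.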