Hi all,
I am trying to calculate the standard deviation of returns in Stata, and have come up with the following code:
Code:
rangestat (mean) return, interval (n_obs 0 3047)
gen diff=return - return_mean
gen diff2 = (diff^2)
by company_id (bcal_date): gen sumdiff2 = sum(diff2)
gen var= (sumdiff2/3047)
gen sd= sqrt(var)
I am using this instead of the code below because, when I enter the L numbers there, only the observations after the latter L number (L3048) show a value for variance, and I do not understand why:
Code:
// CALCULATE SOME RUNNING TOTALS OF RET AND RET^2
by company_id (bcal_date), sort: gen sum_ret = sum(return)
by company_id (bcal_date): gen sum_ret_sq = sum(return^2)
// AND A RUNNING COUNT OF NON MISSING OBSERVATIONS
by company_id (bcal_date): gen int n_obs = sum(!missing(return))
sort company_id bcal_date
// NOW CALCULATE RUNNING STANDARD DEVIATIONS
gen variance = (L1.sum_ret_sq-L3048.sum_ret_sq)/(L1.n_obs-L3048.n_obs)-((L1.sum_ret - L3048.sum_ret)/(L1.n_obs-L3048.n_obs))^2
gen sd = sqrt(variance)
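For reference, the variance line in the block above reads as the usual "mean of squares minus squared mean" formula applied to a trailing window. The sketch below writes that out. The symbols S, Q, N and the Δ terms are labels introduced here for illustration, not names from the code, and it assumes the data have been declared as a panel so that the L operators are valid.

```latex
% Running totals within each company, up to and including period t
% (these correspond to sum_ret, sum_ret_sq and n_obs in the code):
%   S_t = \sum_{s \le t} r_s, \quad
%   Q_t = \sum_{s \le t} r_s^2, \quad
%   N_t = \#\{ s \le t : r_s \text{ non-missing} \}
\begin{align*}
\Delta S_t &= S_{t-1} - S_{t-3048}, \\
\Delta Q_t &= Q_{t-1} - Q_{t-3048}, \\
\Delta N_t &= N_{t-1} - N_{t-3048}, \\
\widehat{\sigma}_t^{2} &= \frac{\Delta Q_t}{\Delta N_t}
  - \left( \frac{\Delta S_t}{\Delta N_t} \right)^{2},
\qquad
\widehat{\sigma}_t = \sqrt{\widehat{\sigma}_t^{2}} .
\end{align*}
```

Read this way, each difference covers periods t-3047 through t-1, so the window ends one period before the current one. In Stata, L#.x refers to the value of x from # periods earlier in the declared time structure, so the L3048 terms have nothing to refer to during the first 3048 periods of each company. That would be consistent with variance being missing there.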
I wanted to ask whether my version of the code will also correctly give the standard deviation I am looking for. Alternatively, could someone explain what the L numbers mean and how I should enter them in the second block of code, so that all values of variance, and ultimately standard deviation, appear?
Thanks so much!
Cen