Dear all,
I have panel data consisting of firms and their daily stock returns. The returns of the firms are called "pch". Assume some arbitrary weights that I call "weight".
For every 21 days of my daily data, I want to compute the monthly covariance between pch and weight. The variable "id_firm_month" is just a group(firm month) variable. But first I want to exclude firm-month groups that have fewer than 10 non-zero and non-missing returns. I type the following commands:
Code:
bys id_firm_month: egen int counta = count(id_firm_month) if !missing(pch) & pch!=0
bys id_firm_month: egen cova = corr(pch weight) if counta>9, covariance
The first command runs fine, but I ran the second for 3 hours and got no result. id_firm_month takes 50,000 values in total, each covering 21 days.
I have Stata MP/14.1 and a desktop with 16GB of RAM and an AMD A4 PRO-7300B APU processor, so my machine is quite powerful. When I run the same command for only 20 id_firm_month groups, I still need to wait 7-8 seconds.
I am following exactly the methodology that other researchers have used, so in theory it should work. Speed matters because later I have to do the same thing on a dataset that is 10 times larger.
Please let me know if you think I am doing something inefficiently and the monthly covariances can be calculated more easily.
An example of my dataset is below. For the 2 id_firm_month groups shown, I should get 2 covariances, one for each month.
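For reference, the quantity being asked for can be written out per firm-month group. The symbols below (group g, n_g paired observations, returns p_i, weights w_i, group means with bars) are notation introduced here for illustration, not taken from any cited methodology. The second form uses only group-level sums and means, which suggests the covariance does not strictly require processing one group at a time.

```latex
\widehat{\operatorname{cov}}_g
  = \frac{1}{n_g - 1} \sum_{i \in g} (p_i - \bar{p}_g)(w_i - \bar{w}_g)
  = \frac{1}{n_g - 1} \Bigl( \sum_{i \in g} p_i w_i - n_g \, \bar{p}_g \, \bar{w}_g \Bigr)
```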
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str16 firm float(id_firm_month pch weight) int counta
"CY00052802151D_w" 17009 .02 .0014250644 19
"CY00052802151D_w" 17009 5.28 .005935295 19
"CY00052802151D_w" 17009 -5.02 .1029575 19
"CY00052802151D_w" 17009 0 .002908295 .
"CY00052802151D_w" 17009 3.3 .012112848 19
"CY00052802151D_w" 17009 9.24 .004154707 19
"CY00052802151D_w" 17009 9.24 .072070256 19
"CY00052802151D_w" 17009 -2.79 .017304068 19
"CY00052802151D_w" 17009 1.3 .000342158 19
"CY00052802151D_w" 17009 1.6 .0009975451 19
"CY00052802151D_w" 17009 -7.82 .30016765 19
"CY00052802151D_w" 17009 2.67 .21011737 19
"CY00052802151D_w" 17009 -.98 .024720097 19
"CY00052802151D_w" 17009 -1.29 .008478994 19
"CY00052802151D_w" 17009 -1.55 .05044918 19
"CY00052802151D_w" 17009 1.8 .0004887971 19
"CY00052802151D_w" 17009 -1.87 .14708215 19
"CY00052802151D_w" 17009 1.45 .0006982816 19
"CY00052802151D_w" 17009 -.64 .035314426 19
"CY00052802151D_w" 17009 3.97 .0020358064 19
"CY00052802151D_w" 17009 0 .0002395106 .
"CY00052802151D_w" 17010 0 .0002395106 .
"CY00052802151D_w" 17010 5.17 .012112848 15
"CY00052802151D_w" 17010 -5.69 .1029575 15
"CY00052802151D_w" 17010 0 .035314426 .
"CY00052802151D_w" 17010 2.9 .017304068 15
"CY00052802151D_w" 17010 -1.46 .0004887971 15
"CY00052802151D_w" 17010 .76 .0006982816 15
"CY00052802151D_w" 17010 4.4 .14708215 15
"CY00052802151D_w" 17010 -3.13 .21011737 15
"CY00052802151D_w" 17010 0 .072070256 .
"CY00052802151D_w" 17010 3.23 .0014250644 15
"CY00052802151D_w" 17010 0 .005935295 .
"CY00052802151D_w" 17010 -3.13 .004154707 15
"CY00052802151D_w" 17010 -3.23 .30016765 15
"CY00052802151D_w" 17010 -2.13 .0020358064 15
"CY00052802151D_w" 17010 3.61 .002908295 15
"CY00052802151D_w" 17010 0 .008478994 .
"CY00052802151D_w" 17010 0 .05044918 .
"CY00052802151D_w" 17010 3.81 .0009975451 15
"CY00052802151D_w" 17010 -.38 .024720097 15
"CY00052802151D_w" 17010 .7 .000342158 15
end
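One possible alternative, offered as a minimal sketch rather than a definitive solution, is to build the covariance from group-level sums using only built-in egen functions, so that each step is a single pass over the sorted data. It assumes the covariance should use every row of a firm-month where both pch and weight are non-missing (zero returns included), and that only the 10-return cutoff decides whether a group is kept. The names ok, n_ok, pair, n_pair, mean_p, mean_w, cross, sum_cross, cova2 and day are hypothetical. The first lines create synthetic toy data purely so the sketch runs on its own; skip them when the real data are in memory. Note also that in Stata a missing value compares as larger than any number, so a condition such as counta>9 is also true on rows where counta is missing; the sketch avoids this by using a count that is never missing.

```stata
* Synthetic toy data so the sketch runs on its own (NOT the real data):
* 3 firm-months of 21 days each. Skip this part when real data are loaded.
clear
set seed 12345
set obs 63
gen id_firm_month = ceil(_n / 21)
gen day = mod(_n - 1, 21) + 1
gen double pch = rnormal()
gen double weight = runiform()
* Leave firm-month 3 with only 9 non-zero returns, so it falls below the cutoff.
replace pch = 0 if id_firm_month == 3 & day > 9

* Count usable (non-missing, non-zero) returns; every row of a group gets the count.
gen byte ok = !missing(pch) & pch != 0
bysort id_firm_month: egen int n_ok = total(ok)

* Rows entering the covariance: both variables observed (assumption, zeros included).
gen byte pair = !missing(pch, weight)
bysort id_firm_month: egen int n_pair = total(pair)
bysort id_firm_month: egen double mean_p = mean(cond(pair, pch, .))
bysort id_firm_month: egen double mean_w = mean(cond(pair, weight, .))

* Sum of cross-products of deviations within each group.
gen double cross = (pch - mean_p) * (weight - mean_w) if pair
bysort id_firm_month: egen double sum_cross = total(cross)

* Sample covariance, only for groups with at least 10 usable returns.
gen double cova2 = sum_cross / (n_pair - 1) if n_ok >= 10 & n_pair > 1
```

As a spot check, correlate pch weight if id_firm_month == 1, covariance should report the same value that cova2 holds for that group. If zero returns should be left out of the covariance itself, the definition of pair would need to change accordingly.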
Thank you in advance,
Dimitris Chlorokostas