Hi everyone,
I am analyzing the return and volatility of companies on specific dates (company events). I am using Stata version 12.1, and my dataset has ~1000 observations and 30 variables. The variables Return and Vola hold the return and volatility of a company's stock price on a specific date. The company identifier is the variable cusip. The dates range from 2000 to 2014. The date variable is not continuous: there are large gaps, and on some dates I have Return and Vola for two or more different companies.
Based on the value of Vola, I have sorted the returns into 5 portfolios. As a result, each of the 5 portfolios has a mean return computed over all companies belonging to it.
Now I want to test whether the difference between the mean returns of portfolios 1 and 5 is significant. To do so, I have to use Newey-West standard errors because of autocorrelation and heteroskedasticity, so I do not want to use ANOVA or Kruskal-Wallis here. The problem is that before using the command "newey" I have to use "tsset" to specify the time variable, and "tsset" does not accept my data as they are: the data are a pooled cross-section (the company code cusip is unique, but the dates have gaps and repeated values).
What I have done so far is:
Code:
generate t = _n
tsset t
newey Return Portfolio, lag(4)
Is it okay to specify t as the time variable in my case? How can I test the significance of the difference in the mean return between portfolio 1 and 5?
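To make the second question concrete, here is a rough sketch of the comparison I have in mind; I am not sure it is correct. It keeps only portfolios 1 and 5, builds a consecutive index on that subsample, and regresses Return on a dummy for portfolio 5, so that the slope is the difference in mean returns and its Newey-West t-statistic would be the test. The names tidx and high5 are hypothetical helpers that are not in my dataset, and treating the sort order as a time index is exactly the assumption I am asking about.

```stata
* Sketch only: tidx and high5 are hypothetical helper variables
preserve
* Keep just the two portfolios being compared
keep if inlist(Portfolio, 1, 5)
* Order by event date (cusip breaks ties) and build a gap-free index
sort Date cusip
generate long tidx = _n
tsset tidx
* Dummy equals 1 for portfolio 5 and 0 for portfolio 1
generate byte high5 = (Portfolio == 5)
* The coefficient on high5 is mean(Return | 5) minus mean(Return | 1)
newey Return high5, lag(4)
restore
```

If this approach is valid, the coefficient on high5 should match the raw difference between the two portfolio means, and only the standard error would differ from a plain regression.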
Thank you very much in advance!
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double Date str8 cusip float(Vola Return Portfolio)
14749 "72581M10" .9122283 .0890625 5
14754 "15231910" .9651068 .20394737 5
14755 "83591610" .9569203 1.195652 5
14761 "62937750" .2449661 .10833333 3
14761 "32051K10" .8661846 0 5
14775 "05106U10" .6686476 .265625 5
14777 "24823Q10" .8361632 -.03125 5
14782 "15670R10" .6859892 .5 5
14784 "74758R10" .7160541 .10267857 5
14784 "15986410" .7598475 .375 5
14790 "14067D10" .52998465 1.9990234 5
14791 "60741U10" .6971611 .08333334 5
14804 "86769Y10" .7487978 1.6875 5
14804 "64121A10" .9361812 .7058824 5
14804 "44973Q10" .7487978 .3871528 5
14805 "69562K10" .7451704 .8229167 5
14805 "68212810" .7737479 1.6153846 5
14810 "04017510" .789129 .5625 5
14811 "92231M10" .55427986 .3489583 5
end
format %d Date