Newey West in pooled cross-section

Nina Lindt

Join Date: Sep 2018

Posts: 5
#1

27 Sep 2018, 08:38

Hi everyone,

I am analyzing the return and volatility of companies on specific dates (company events). I am using Stata version 12.1 and I have a dataset with ~1000 observations and 30 variables. My variables Vola and Return represent the volatility and return of a company's stock price on a specific date. The company code is the variable cusip. The dates run from 2000 to 2014. The date variable is not continuous: there are large gaps, and on some dates I have Return and Vola for 2 or more different companies.

Based on the value of Vola, I have sorted the returns into 5 portfolios, so each portfolio has a mean return computed over all companies belonging to it.
Now I want to test whether the difference between the mean returns of Portfolio 1 and Portfolio 5 is significant. For this test I have to use Newey-West standard errors because of autocorrelation and heteroskedasticity, so I do not want to use ANOVA or Kruskal-Wallis here. The problem is that before using the command "newey" I have to use "tsset" to specify the time variable, and my data is a pooled cross-section (the company code cusip is unique, but the dates have gaps and repeat).

What I have done so far is:

Code:

generate t = _n
tsset t
newey Return Portfolio, lag(4)

Is it okay to specify t as the time variable in my case? How can I test the significance of the difference in the mean return between portfolio 1 and 5?
Thank you very much in advance!

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double Date str8 cusip float(Vola Return Portfolio)
14749 "72581M10" .9122283 .0890625 5
14754 "15231910" .9651068 .20394737 5
14755 "83591610" .9569203 1.195652 5
14761 "62937750" .2449661 .10833333 3
14761 "32051K10" .8661846 0 5
14775 "05106U10" .6686476 .265625 5
14777 "24823Q10" .8361632 -.03125 5
14782 "15670R10" .6859892 .5 5
14784 "74758R10" .7160541 .10267857 5
14784 "15986410" .7598475 .375 5
14790 "14067D10" .52998465 1.9990234 5
14791 "60741U10" .6971611 .08333334 5
14804 "86769Y10" .7487978 1.6875 5
14804 "64121A10" .9361812 .7058824 5
14804 "44973Q10" .7487978 .3871528 5
14805 "69562K10" .7451704 .8229167 5
14805 "68212810" .7737479 1.6153846 5
14810 "04017510" .789129 .5625 5
14811 "92231M10" .55427986 .3489583 5
end
format %d Date
Nina Lindt

Join Date: Sep 2018

Posts: 5
#2

27 Sep 2018, 18:58

Can anyone please help me? Or do you need additional information?

To summarize my problem: I want to calculate the mean return of Portfolio 5 minus the mean return of Portfolio 1, and test whether this difference is significant using Newey-West adjusted standard errors.

Thank you very much in advance,

Nina
Attaullah Shah

Join Date: Aug 2014

Posts: 1669
#3

28 Sep 2018, 00:06

First, I cannot see portfolio 1 in the example dataset that you provided in post 1. Second, for the newey command to work, there must be a sufficient number of observations. Therefore, the difference you create from the means of the two portfolios should itself form a time series of returns. Keeping these two points in view, you need to do something like this:

Code:

* Find average periodic return for each portfolio
bys Date : egen ret5 = mean(Return / (Portfolio ==5))
bys Date : egen ret1 = mean(Return / (Portfolio ==1))
* Find difference
gen dif = ret5 -ret1
* newey
newey dif, lag(4)
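
The same idea can be sketched end to end on a small made-up dataset. This is only an illustration under stated assumptions, not a definitive recipe: the toy values and the names r1, r5, ret1, ret5, dif and t are hypothetical, the data are first reduced to one observation per date so that tsset accepts them, and consecutive event dates are treated as adjacent periods, which is a modelling choice rather than something the thread confirms. The lag length is set to 2 here only because the toy series is short.

```stata
* Toy data (made up for illustration): several companies per date, gaps between dates
clear
input double Date float(Return Portfolio)
17400  .02 1
17400  .10 5
17405  .01 1
17405  .07 5
17405  .03 5
17411 -.02 1
17411  .04 5
17420  .05 3
17426  .03 1
17426  .12 5
17433  .00 1
17433 -.01 5
17440  .02 1
17440  .06 5
end
format %d Date

* Returns of portfolios 1 and 5 in separate variables (missing otherwise)
gen double r1 = Return if Portfolio == 1
gen double r5 = Return if Portfolio == 5

* One observation per date: mean return of each portfolio on that date
collapse (mean) ret1 = r1 ret5 = r5, by(Date)

* Difference series; dates lacking either portfolio drop out
gen double dif = ret5 - ret1
drop if missing(dif)

* Assumption: index the remaining event dates consecutively as the time variable
sort Date
gen long t = _n
tsset t

* Constant-only regression: the intercept is the mean difference,
* reported with Newey-West standard errors
newey dif, lag(2)
```

The t-statistic on _cons is then the test of whether the mean difference is zero. With sparse event data, few dates may contain both portfolios, so the usable series can be much shorter than the raw dataset.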

Regards
--------------------------------------------------
Attaullah Shah, PhD.
Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
FinTechProfessor.com
https://asdocx.com
Check out my asdoc program, which sends outputs to MS Word.
For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

Nina Lindt

Join Date: Sep 2018
Posts: 5

28 Sep 2018, 05:43

Thank you very much for your response, Mr Shah.
In my example I sorted the observations on Date and Portfolio, so that you could only see Portfolio 5. Here is a more representative example from my dataset.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double Date str8 cusip float(Vola Return Portfolio)
17366 "00950V10" .27122802  .024285715 4
17372 "45248L30" .26707476       -.042 4
17373 "55002110"  .3089233    .5555556 4
17379 "25659P40" .23309126   .22206897 3
17380 "86490910"  .2117216  .066086955 3
17380 "92827P10"  .2152245  -.15285714 3
17380 "20605P10" .25087494   .12173913 3
17386 "24802R50"  .3004555  -.14090909 4
17387 "44069430"  .2554734  .033333335 4
17436 "26433B10"   .183361     .146875 2
17443 "56509R10" .23802303       .1125 3
17449 "92769R10" .26415515         .05 4
17463 "90384S30" .12656954    .6566667 1
17463 "74731Q10" .23037523   .11611111 3
17464 "73929910"  .2522463   .07181818 4
17471 "63009F10" .18183444   .14285715 2
17471 "83609110"  .3024109       .0025 4
17471 "64128B10"  .3484251    .4485714 4
17471 "24784L10"   .361714 -.002777778 5
17477 "03834A10"  .2531909   .04916667 4
end
format %d Date

Did I understand correctly that in my actual case I cannot use -newey- to generate Newey-West standard errors because I have no continuous time series? Or is it possible to use -newey- although there are gaps in my dates and several observations on the same date (like 02aug2007 in my example)?
Is there any way to create Newey-West standard errors in my case?

Thank you very much in advance,

Nina


Kate Lussy

Join Date: Apr 2019

Posts: 42
#5

29 Apr 2019, 11:39

Hi Nina, I was wondering if you figured out how to do it in your case? I am struggling with the same topic and your insights would really help!
Kate Lussy

Join Date: Apr 2019

Posts: 42
#6

01 May 2019, 05:05

Hi Nina, I was wondering if you figured out how to do it in your case? I am struggling with the same topic and your insights would really help!