Strange p-values for dummy variables when using robust regression

Niklas Plath

Join Date: Apr 2017
Posts: 4

23 Apr 2017, 14:16

Dear community,

When I run a robust regression on my data, the p-values of all my dummy variables suddenly become 0. I have been browsing the web for hours, but I can't figure out why this happens.

In the following regression GB_AUS is the daily change in Australian government bond yields from 2007 to 2017,

EV_APP_1 EV_APP_2 EV_APP_3 EV_APP_4 EV_CBPP1_1 EV_CBPP1_2 EV_CBPP2_1 EV_CBPP3_1 EV_LTRO36_1 EV_LTRO36_2 EV_OMT_1 EV_OMT_2 EV_OMT_3 EV_SMP_1 EV_SMP_2 are all binary dummy variables. They have the value "1" on only one day within the sample period and "0" otherwise.

VIX and CESI_AUS are two indices with daily (continuous) data, included as controls for the change in GB_AUS.

Running

Code:

regress GB_AUS EV_APP_1 EV_APP_2 EV_APP_3 EV_APP_4 EV_CBPP1_1 EV_CBPP1_2 EV_CBPP2_1 EV_CBPP3_1 EV_LTRO36_1 EV_LTRO36_2 EV_OMT_1 EV_OMT_2 EV_OMT_3 EV_SMP_1 EV_SMP_2 VIX CESI_AUS

results in:


GB_AUS	Coef.	Std. Err.	t	P>|t|	[95% Conf.	Interval]

EV_APP_1	.0491892	.0660098	0.75	0.456	-.080247	.1786254
EV_APP_2	.0458176	.066025	0.69	0.488	-.0836485	.1752837
EV_APP_3	.0192431	.0659889	0.29	0.771	-.1101521	.1486384
EV_APP_4	-.0670129	.0660237	-1.01	0.310	-.1964765	.0624508
EV_CBPP1_1	.1063229	.0664438	1.60	0.110	-.0239644	.2366102
EV_CBPP1_2	-.0606885	.0663267	-0.91	0.360	-.1907461	.0693692
EV_CBPP2_1	.0902273	.0660072	1.37	0.172	-.039204	.2196586
EV_CBPP3_1	-.0369259	.065995	-0.56	0.576	-.1663332	.0924815
EV_LTRO36_1	.0290509	.0660259	0.44	0.660	-.100417	.1585188
EV_LTRO36_2	-.0645936	.0660567	-0.98	0.328	-.1941219	.0649347
EV_OMT_1	.0101821	.0660002	0.15	0.877	-.1192353	.1395995
EV_OMT_2	.0169853	.0660126	0.26	0.797	-.1124565	.1464271
EV_OMT_3	.0596224	.066035	0.90	0.367	-.0698634	.1891081
EV_SMP_1	.0352793	.0665105	0.53	0.596	-.0951389	.1656975
EV_SMP_2	-.2065437	.0668865	-3.09	0.002	-.3376992	-.0753883
VIX	-.0039786	.0006823	-5.83	0.000	-.0053164	-.0026408
CESI_AUS	.000466	.0001899	2.45	0.014	.0000936	.0008384
_cons	-.0011969	.0012848	-0.93	0.352	-.0037162	.0013224

However, since the White test reveals heteroscedasticity I need to use robust regression: vce(robust)

Code:

regress GB_AUS EV_APP_1 EV_APP_2 EV_APP_3 EV_APP_4 EV_CBPP1_1 EV_CBPP1_2 EV_CBPP2_1 EV_CBPP3_1 EV_LTRO36_1 EV_LTRO36_2 EV_OMT_1 EV_OMT_2 EV_OMT_3 EV_SMP_1 EV_SMP_2 VIX CESI_AUS, vce(robust)

		Robust
GB_AUS	Coef.	Std. Err.	t	P>|t|	[95% Conf.	Interval]

EV_APP_1	.0491892	.003023	16.27	0.000	.0432615	.0551169
EV_APP_2	.0458176	.0029652	15.45	0.000	.0400033	.0516319
EV_APP_3	.0192431	.0013502	14.25	0.000	.0165955	.0218908
EV_APP_4	-.0670129	.0023559	-28.44	0.000	-.0716324	-.0623933
EV_CBPP1_1	.1063229	.007175	14.82	0.000	.0922538	.120392
EV_CBPP1_2	-.0606885	.0062763	-9.67	0.000	-.0729954	-.0483815
EV_CBPP2_1	.0902273	.0024017	37.57	0.000	.0855179	.0949367
EV_CBPP3_1	-.0369259	.0015271	-24.18	0.000	-.0399204	-.0339314
EV_LTRO36_1	.0290509	.0024498	11.86	0.000	.0242472	.0338546
EV_LTRO36_2	-.0645936	.0033745	-19.14	0.000	-.0712105	-.0579767
EV_OMT_1	.0101821	.0024198	4.21	0.000	.0054373	.0149269
EV_OMT_2	.0169853	.0024399	6.96	0.000	.0122009	.0217696
EV_OMT_3	.0596224	.0032986	18.07	0.000	.0531542	.0660906
EV_SMP_1	.0352793	.0132084	2.67	0.008	.0093793	.0611792
EV_SMP_2	-.2065437	.0171265	-12.06	0.000	-.2401265	-.172961
VIX	-.0039786	.0010733	-3.71	0.000	-.0060833	-.0018739
CESI_AUS	.000466	.0001718	2.71	0.007	.0001291	.0008029
_cons	-.0011969	.0012886	-0.93	0.353	-.0037237	.0013298

Suddenly, all dummy variables are highly significant. This doesn't make any sense, because there was hardly an effect on the Australian government bond yields on the days when these dummy variables took the value "1".

I'd greatly appreciate your help on how to account for heteroscedasticity (and potentially autocorrelation) without messing up the p-values for my dummy variables.


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17703
#2

23 Apr 2017, 14:45

Niklas:
Welcome to the list.
You ran an OLS regression with robust standard errors, which is different from a robust regression (please see -rreg-).
Have you checked for multicollinearity via -estat vif-?
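
For readers unsure of the distinction drawn here, a minimal sketch contrasts the commands. It uses the auto dataset that ships with Stata rather than the poster's data, and the variables price, mpg and weight are chosen only for illustration.

```stata
* Illustrative data only: the auto dataset shipped with Stata
sysuse auto, clear

* OLS coefficients with heteroskedasticity-robust (sandwich) standard errors;
* the point estimates are identical to plain -regress-
regress price mpg weight, vce(robust)

* Robust regression: an iteratively reweighted estimator that downweights
* outlying observations, so the coefficients themselves differ from OLS
rreg price mpg weight

* Variance inflation factors, computed after a plain OLS fit
regress price mpg weight
estat vif
```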

Kind regards,
Carlo
(Stata 19.0)

Niklas Plath

Join Date: Apr 2017
Posts: 4

23 Apr 2017, 17:17

Carlo,
thank you for your response. When I wrote "robust regression", I meant to say OLS with robust standard errors - sorry for the confusion.

I ran the test for multicollinearity; nothing looks suspicious as far as I can tell:

Variable	VIF	1/VIF

VIX	1.05	0.955752
CESI_AUS	1.03	0.970111
EV_SMP_2	1.03	0.973328
EV_SMP_1	1.02	0.984363
EV_CBPP1_1	1.01	0.986342
EV_CBPP1_2	1.01	0.989829
EV_LTRO36_2	1.00	0.997936
EV_OMT_3	1.00	0.998592
EV_LTRO36_1	1.00	0.998867
EV_APP_2	1.00	0.998895
EV_APP_4	1.00	0.998933
EV_OMT_2	1.00	0.999270
EV_APP_1	1.00	0.999356
EV_CBPP2_1	1.00	0.999432
EV_OMT_1	1.00	0.999647
EV_CBPP3_1	1.00	0.999802
EV_APP_3	1.00	0.999989

Mean VIF	1.01

I also ran the Breusch-Pagan / Cook-Weisberg test for heteroskedasticity, as well as White's general test. Both tests indicate that heteroskedasticity is present:

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
Ho: Constant variance
Variables: fitted values of GB_AUS

chi2(1) = 9.41
Prob > chi2 = 0.0022

White's general test statistic : 84.96892 Chi-sq(20) P-value = 5.5e-10

Wouldn't -regress depvar indepvars, vce(robust)- be the correct fix in such a situation?
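
The commands behind these tests are not shown in the post. As a hedged sketch, Stata's built-in postestimation commands for tests of this kind can be run as follows, again on the shipped auto dataset with illustrative variables rather than the poster's data.

```stata
* Illustrative data only: the auto dataset shipped with Stata
sysuse auto, clear

* Fit by plain OLS first; the tests below are postestimation commands
regress price mpg weight

* Breusch-Pagan / Cook-Weisberg test, by default using the fitted values
estat hettest

* White's general test for heteroskedasticity
estat imtest, white
```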

Last edited by Niklas Plath; 23 Apr 2017, 17:19.


Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#4

24 Apr 2017, 11:04

Dear Niklas Plath,

My guess is that some (all?) of your dummies have very few (only one?) observations equal to 1. If that is the case, the standard errors are not really meaningful and using the "robust" option just inflates the t-statistics.
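
The mechanism behind this guess can be written out. The following is a sketch for the simplest heteroskedasticity-robust estimator (HC0), using notation introduced here for illustration only: d is a dummy equal to 1 for observation j alone, with coefficient \gamma; X holds the remaining regressors (including the constant) with coefficients \beta; h_{ik} = x_i'(X'X)^{-1}x_k are the elements of the hat matrix of X; \hat e_i are the OLS residuals of the full regression; and s^2 is the usual residual variance estimate.

```latex
\begin{align*}
\hat\gamma &= y_j - x_j'\hat\beta
  \quad\Longrightarrow\quad \hat e_j = 0, \\[4pt]
\widehat{\operatorname{Var}}_{\mathrm{OLS}}(\hat\gamma)
  &= \frac{s^2}{1 - h_{jj}}, \\[4pt]
\widehat{\operatorname{Var}}_{\mathrm{HC0}}(\hat\gamma)
  &= \frac{\sum_{i \neq j} h_{ij}^2\,\hat e_i^2}{(1 - h_{jj})^2}.
\end{align*}
```

The dummy fits its single observation exactly, so the only residual that carries information about that day is zero. What remains in the robust variance are terms weighted by h_{ij}^2, which are typically of order 1/n^2 each, so the robust variance shrinks toward zero as the sample grows while the conventional variance stays close to s^2. Stata's vce(robust) multiplies the HC0 expression by n/(n-k), which does not change this picture.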

Best wishes,

Joao
Niklas Plath

Join Date: Apr 2017

Posts: 4
#5

26 Apr 2017, 03:13

Your guess is absolutely correct, Joao. Each dummy has only one observation equal to 1 and 2651 observations equal to 0. According to Ford, Jackson and Skinner (2010) and Fomby and Murfin (2005), t-statistics are indeed very much inflated when you use the "robust" option or Newey-West standard errors for a model like mine.
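
A small simulation makes the effect visible. This is a sketch under stated assumptions, not the poster's data: the variable names, the seed, the position of the event day and the data-generating process are all hypothetical, and the true effect of the event dummy is set to zero.

```stata
clear
set seed 12345
set obs 2652                        // 1 event day plus 2651 other days, as in the thread

generate x = rnormal()              // hypothetical continuous control
generate byte ev = (_n == 1000)     // dummy equal to 1 on one arbitrary day
generate y = 0.5*x + rnormal()      // homoskedastic errors, no true event effect

* Conventional standard error on ev is close to the root MSE
regress y ev x

* Same coefficient, but the robust standard error on ev is far smaller,
* so the t-statistic is inflated although the true effect is zero
regress y ev x, vce(robust)
```

Comparing the two rows for ev shows the same pattern as the output in the first post: identical coefficients, with the robust standard error a small fraction of the conventional one.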

What other options do I have to account for the existing heteroscedasticity and autocorrelation? Since the main contribution of my thesis is the interpretation of the dummies' coefficient estimate, I do need somewhat reliable SE/P-values.

What I have tried since my last post: I bootstrapped the standard errors without success (the t-statistics remain inflated). I also included up to 12 lags of my dependent variable in the regression. On average, the first three lags have coefficients that are statistically significant at the 5% level. Despite these corrections, Durbin's alternative test and the Breusch-Godfrey test still indicate significant autocorrelation.
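
The exact commands used for these steps are not given in the post. A minimal sketch of how such steps are commonly run in Stata follows; the simulated data, the trading-day index t, the lag lengths and the number of bootstrap replications are all assumptions made for illustration.

```stata
clear
set seed 2017
set obs 500

generate t = _n                     // hypothetical trading-day index without gaps
tsset t                             // declare the data as a time series
generate x = rnormal()
generate y = 0.5*x + rnormal()

* Add lags of the dependent variable, then test for remaining autocorrelation
regress y x L(1/3).y
estat durbinalt, lags(1/3)          // Durbin's alternative test
estat bgodfrey, lags(1/3)           // Breusch-Godfrey LM test

* Newey-West (HAC) standard errors with a chosen lag truncation
newey y x, lag(3)

* Bootstrapped standard errors
regress y x, vce(bootstrap, reps(200))
```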
Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#6

26 Apr 2017, 12:44

Dear Niklas Plath,

Thanks for getting back to us on this. For the record, can you please provide the full references for the papers you mentioned?

The bad news is that there is essentially no good way of doing what you are trying to do. To be more precise, there is a way, but it may be too complicated for what you are doing. Feel free to email me to discuss this.

Best regards,

Joao
Niklas Plath

Join Date: Apr 2017

Posts: 4
#7

28 Apr 2017, 08:33

Dear Joao,

Of course:
Ford, Jackson and Skinner (2010) refers to "HAC standard errors and the event study methodology: a cautionary note" published in Applied Economics Letters
Fomby and Murfin (2005) refers to "Inconsistency of HAC standard errors in event studies with i.i.d. errors" published in Applied Financial Economics Letters

Thank you so much for your offer to continue the discussion via email. I will email you shortly at the address I found on the University of Surrey website.

Best,
Niklas
Pedro Loureiro

Join Date: Dec 2022

Posts: 1
#8

05 Dec 2022, 13:10

Hi all. I'm having exactly the same problem. Joao Santos Silva or Niklas Plath, could either of you please tell me how you solved it?
Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#9

07 Dec 2022, 01:37

Thank you for contacting me directly; I hope you now know how to proceed, but let me know if I can help.

Best wishes,

Joao