Robust standard errors - OLS - Right-skewed distribution

Rolf Er Ren

Join Date: Oct 2019

Posts: 5
#1


16 Oct 2019, 05:32

Hi,

I am looking at data on the returns of companies after their initial public offering. The return data are skewed to the right, and White's test suggests that they are highly heteroskedastic and skewed (see picture below).

I have tried using robust standard errors in my regression (regress x y, robust) which alters my results. However, I do not know what the robust standard errors take into account.

Do you know if they account for the skewness of the distribution? If not, is there any way to take this into account when making statistical inference?

I hope you can help!

Best,
Rolf

Tags: data, OLS, regression, robust, skewness
Nick Cox

Join Date: Mar 2014

Posts: 35713
#2

16 Oct 2019, 05:40

If you show us the results of

Code:

scatter x y

then we would have a picture of what is going on. It seems more likely to me that you should reconsider your model functional form than that anything much will be solved by getting different standard errors. In any case, it is the conditional distributions that matter, not the marginal distribution.

Note: conventionally the response or outcome is called y and the predictor x. I am just echoing what you say you used as regress syntax.
Rolf Er Ren

Join Date: Oct 2019

Posts: 5
#3

16 Oct 2019, 05:51

Thanks for the quick response!

Return is the dependent variable (endogenous) and I have, including dummies, 22 explanatory variables (exogenous).

These are the residuals from my non-robust regression, plotted with the code:

Code:

predict resid, residuals
predict yhat, xb
scatter yhat resid

Does this make any sense?

Attached Files
Nick Cox

Join Date: Mar 2014

Posts: 35713
#4

16 Oct 2019, 06:03

So the mention of regress x y, robust was nonsensical, or at least not to be taken literally. How were we supposed to know?

It's conventional to plot residuals versus fitted, not as you have it here. Either way, a constraint on the response is a constraint on the configuration of your plot.

It seems that your regression takes no account whatever of the bounded character of your response. Fitting a hyperplane looks suspect to me in that situation, but it is hard to give more constructive advice. Again, the most that a robust option can do is give, as it were, more honest standard errors and perhaps less misleading tests and confidence intervals. It can't correct a dubious model.
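
A minimal sketch of the conventional residual-versus-fitted plot, and of what the robust option does and does not change. It assumes the outcome is Threeyearreturn and the predictor of interest is PEBacked, the names given elsewhere in the thread; x1 and x2 are hypothetical stand-ins for the remaining covariates.

```stata
* x1 x2 are hypothetical placeholders; substitute the real covariate list.
regress Threeyearreturn PEBacked x1 x2

* Built-in residual-versus-fitted plot after regress
rvfplot, yline(0)

* The same plot by hand: residuals on the y axis, fitted values on the x axis
predict double fitted, xb
predict double res, residuals
scatter res fitted, yline(0)

* vce(robust) changes only the standard errors; the coefficients are identical
regress Threeyearreturn PEBacked x1 x2, vce(robust)
```

Comparing the two regress outputs side by side shows the same coefficient estimates with different standard errors, which is why the robust option cannot repair a misspecified functional form.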
Rolf Er Ren

Join Date: Oct 2019

Posts: 5
#5

16 Oct 2019, 06:30

Sorry for my poor understanding of the topic - I appreciate you taking the time!

I have tried to be very specific about my steps in the code. I know for a fact that my dependent variable, "Threeyearreturn", is skewed to the right, but I naively chose to use OLS, as I do not know what model I can use instead - perhaps you do?

Below you can find my output.

Continuing with OLS:
I start out with a regression using non-robust standard errors and test for heteroskedasticity; the null of homoskedasticity is rejected across multiple tests. Therefore, I use a regression with robust standard errors. The variable "PEBacked", which is the one of interest, goes from being significantly different from 0 at the 5-percent level (using a t-test) to not being significantly different from 0 in the regression using robust standard errors.

Do you have any recommendation on how to circumvent this by using a different functional form etc?

Attached Files
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#6

16 Oct 2019, 06:40

Your situation is still not clear to me (and you should read the FAQ, as much of your posting is unreadable). I suggest you see the following Stata blog post:

https://blog.stata.com/2011/08/22/us...tell-a-friend/
FernandoRios

Join Date: Apr 2014

Posts: 2470
#7

16 Oct 2019, 06:41

Hi Rolf,
As you have observed from your results, and as Nick Cox already mentioned, using robust standard errors simply recalculates the standard errors of the estimated coefficients, using a more conservative estimate that allows for heteroskedasticity. On the other hand, robust standard errors are only asymptotically valid.
Since you have 260 observations in your model, robust standard errors may not be the best approach.
Perhaps something you can try is Weighted Least Squares. (You can find the explanation for this in chapter 8 of Introductory Econometrics: A Modern Approach by Wooldridge.)
HTH
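
One common way to carry out the WLS suggestion is a feasible GLS procedure: model the log of the squared OLS residuals, then weight each observation by the inverse of its fitted variance. The sketch below assumes an exponential variance function, which has not been checked against these data, and uses x1 and x2 as hypothetical stand-ins for the remaining covariates.

```stata
* x1 x2 are hypothetical placeholders; substitute the real covariate list.
* Step 1: OLS and its residuals
regress Threeyearreturn PEBacked x1 x2
predict double uhat, residuals

* Step 2: regress log squared residuals on the same covariates
generate double lnu2 = ln(uhat^2)
regress lnu2 PEBacked x1 x2
predict double ghat, xb

* Step 3: fitted variance (assumed exponential form) and weighted regression
generate double hhat = exp(ghat)
regress Threeyearreturn PEBacked x1 x2 [aweight = 1/hhat]
```

If the assumed variance function is wrong, the weighted estimates are still consistent under the usual exogeneity assumptions, but the reported standard errors are not reliable unless vce(robust) is added to the final regression.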
Rolf Er Ren

Join Date: Oct 2019

Posts: 5
#8

16 Oct 2019, 07:04

Thanks for the response. I will look into WLS.

Have a great day!
Brad Anderson

Join Date: Sep 2014

Posts: 70
#9

16 Oct 2019, 07:51

Nick raised the question about whether this was the correct functional form. I'm asking this naively since I don't know any of the theory on which this is based. But is the original distribution zero-limited with a lot of observations at or near the lower limit? And conceptually, would you expect the importance of a 1 unit change in return (your outcome) to be equal across the range of return? For example, is the difference between 0 and 1 of equal importance to the difference between 90 and 91? A tangible example would be income: the difference between $20k and $21k is likely much more meaningful than the difference between $100k and $101k. If not, then you probably should be considering a different model.
Nick Cox

Join Date: Mar 2014

Posts: 35713
#10

16 Oct 2019, 08:31

There must be a literature on this, with hundreds if not thousands of papers on returns as the outcome of interest. I have no idea what that literature is -- I am, or used to be, a geomorphologist, although I strayed. But surely a researcher should be looking at it.

But if returns are bounded below by -1 and not bounded above, then I would expect a suitable link function to be log(1 + return).
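
Two ways this suggestion might be tried in Stata, sketched under assumptions: Threeyearreturn is taken to be a simple return bounded below by -1, and x1 and x2 are hypothetical stand-ins for the remaining covariates. The names lnret and gross are introduced here for illustration only.

```stata
* x1 x2 are hypothetical placeholders; substitute the real covariate list.
* Option 1: transform the outcome. ln(0) is missing, so a return of exactly -1 drops out.
generate double lnret = ln(1 + Threeyearreturn)
regress lnret PEBacked x1 x2, vce(robust)

* Option 2: keep gross return on its original scale and use a log link.
generate double gross = 1 + Threeyearreturn
glm gross PEBacked x1 x2, family(gaussian) link(log) vce(robust)
```

The two options are not the same model: the first describes the mean of the logged outcome, while the second describes the log of the mean outcome, so the coefficients need not agree.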
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#11

17 Oct 2019, 08:40

I would also suggest that it sure looks like predicted values and residuals are associated, which raises issues about the estimator. As Nick points out, this may come from not being able to lose more than your full investment, but I suspect it may be more than that. A different functional form is one option. A tobit-type model is another.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#12

17 Oct 2019, 09:07

Rolf:
as an aside to the previous excellent points, I would also consider a more parsimonious regression model: 20 predictors with 266 observations sounds like torturing your data.

Kind regards,
Carlo
(Stata 19.0)