
  • Cluster Standard Errors with rreg

    I am using rreg (among other robustness checks) for my results. However, I want to use clustered standard errors with rreg, since my baseline regressions report clustered errors. I am using the following approach (a simplified version is presented here): I save the observation weights generated by rreg and then rerun the regression with the vce(cluster) option:

    Code:
    rreg y x, gen(weights)
    reg y x [w=weights], vce(cluster clustervar)

    The coefficient estimates of the two regressions above are identical, as they should be. The only difference is in the standard errors. Any comments on whether this can be done?
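    For what it is worth, a bare [w=...] after regress is interpreted as an aweight; spelling the weight type out makes the comparison explicit. A sketch only, reusing the placeholder names above; note that these two-step clustered standard errors ignore the estimation uncertainty in the rreg weights themselves, so they are at best approximate:

    Code:
    rreg y x, gen(w_rreg)
    * aweights reproduce the rreg point estimates
    regress y x [aw=w_rreg], vce(cluster clustervar)
    * pweights give identical coefficients and imply robust/clustered SEs
    regress y x [pw=w_rreg], vce(cluster clustervar)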


  • #2
    Hi Girish,

    I have the same question. May I ask whether you found the answer?
    Thanks!

    • #3
      I do not think clustered standard errors are available for rreg in Stata. More importantly, I would advise against the use of rreg.

      Best regards,

      Joao

      • #4
        Hi Joao,

        Thank you for replying to this question! May I ask why you advise against the use of rreg? I'm actually struggling with how to deal with the outliers in my model, or even how to define the "outliers".

        Best,
        Yiting

        • #5
          Yiting:
          about the progressive side-tracking of -rreg-, see Richard Williams' reply at: http://www.statalist.org/forums/foru...-rreg-v-robreg
          Kind regards,
          Carlo
          (Stata 19.0)

          • #6
            Dear Yiting,

            I wrote a paper about it, but in short the problem is that it is not clear what -rreg- estimates; this applies to many of the other so-called robust regression methods.

            Outliers are defined with respect to a model; it may be that if you change the model the "outliers" disappear. Also, in many cases, outliers are a feature of the data and you do not want to "deal" with them in any way. Finally, you may consider estimating something that is less sensitive to outliers, such as median regression.

            Best wishes,

            Joao
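            For example, a median-regression check along these lines might look as follows. A sketch only, reusing the y and x placeholders from post #1; reps(200) is an arbitrary choice:

            Code:
            * median (least absolute deviations) regression
            qreg y x
            * bootstrapped standard errors for the quantile regression
            bsqreg y x, reps(200)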

            • #7
              Hi Joao,

              Good morning! Thanks a lot! I'll read the paper. I do notice that using robust regression changes the sign of some coefficients in my model. But excluding the "outliers" with extreme residuals, leverage, or Cook's D does not affect the results at all. It seems that rreg's results are not that reliable.

              Best,

              Yiting

              • #8
                Yiting: No, that's not evidence against rreg. It is supposed to be resistant to possible outliers. So, you should not expect that removing them makes much difference to the model fitted.

                I don't think that rreg is a good command to use and have been saying so on Statalist for several years, but I won't support fallacious arguments against it.
                Last edited by Nick Cox; 14 Dec 2016, 07:08.

                • #9
                  Hi Nick, I'm a little bit confused. My understanding of rreg is that it drops observations with Cook's D > 1 and then iteratively downweights observations with large residuals (Huber weighting followed by biweighting).

                  What I find is that if I use rreg my results change substantially. BUT if I run OLS after dropping observations with large Cook's D (or leverage, or residuals), defined in various ways (such as Cook's D > 4/n, or the top 10 or 20 values), the results are very similar to those without any adjustment for outliers.

                  I want to know whether the outliers in my regression significantly affect my results: using rreg suggests a big effect from outliers, while excluding high-Cook's-D observations suggests no big effect. Which one should I trust?

                  Thanks!
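                  For reference, the kind of comparison described above can be run as follows. A sketch with the placeholder names from post #1; 4/e(N) is the conventional cutoff mentioned above:

                  Code:
                  regress y x
                  predict d_cook, cooksd
                  * re-estimate after dropping high-influence observations
                  regress y x if d_cook < 4/e(N)
                  * compare with the robust-regression fit
                  rreg y x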

                  • #10
                    Yiting:
                    if you have already ruled out that the outliers stem from erroneous data entry, I would be very cautious about dropping them.
                    By removing outliers you are, in effect, making up your original sample; hence the results of your regression refer to the made-up sample rather than the original one.
                    Kind regards,
                    Carlo
                    (Stata 19.0)

                    • #11
                      Yiting:

                      I agree with Carlo Lazzaro and want to add more.

                      I think you're asking for a single correct view and in my judgement there isn't one. Statistical people do disagree about this.

                      Some very smart people still pursue "robust statistics" but my guess is that they are searching for a Holy Grail that doesn't exist. That doesn't rule out robust-resistant statistics as helpful in various ways, especially for description and exploration.

                      Crudely, rreg is about fitting Xb and getting the right answer even when the data are awkward. My own experience is that this is usually the wrong question altogether, quite apart from real uncertainty about what the right answer would be even within that framework. Almost always, I would say, it is immensely more fruitful to find a model in the light of which the data no longer appear awkward. exp(Xb) is, for example, often a better starting point.

                      Similarly, I don't buy any implicit premise here that outliers are simply identifiable and stand out like elephants in a flower shop. What is, and is not, apparently an outlier is utterly model-dependent. Unless outliers are just impossible values (in which case they can be flagged and omitted on direct scientific grounds), outliers are not self-evident in my experience. Using a logarithmic scale alone tames almost all of what my colleagues and students are tempted to call outliers.
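                      Two ways of acting on that advice in Stata (sketches only, with the placeholder names from post #1; the first assumes y is strictly positive):

                      Code:
                      * refit on a logarithmic scale
                      generate ln_y = ln(y)
                      regress ln_y x
                      * or model E(y|x) = exp(xb) directly, keeping y on its original scale
                      glm y x, family(gaussian) link(log) vce(robust)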
