
  • #16
    Thanks for the graphs. I don't see anything to worry about. Some mild outliers. Nothing that makes me think that you need anything special.



    • #17
      Thanks! That sounds good!
      Sorry for asking another question, but I previously thought that outliers had everything to do with skewness (I know a bit better now).
      But what do I do with -sktest size- (and all the other variables) telling me that the variable is skewed?
      And also: isn't there a problem somewhere when my residuals are not normally distributed, as -sktest res- tells me?


      Code:
      . sktest size
      
                          Skewness/Kurtosis tests for Normality
                                                                ------ joint ------
          Variable |        Obs  Pr(Skewness)  Pr(Kurtosis) adj chi2(2)   Prob>chi2
      -------------+---------------------------------------------------------------
              size |        458     0.0000        0.0000           .         0.0000
      
      .

      Code:
      . sktest res
      
                          Skewness/Kurtosis tests for Normality
                                                                ------ joint ------
          Variable |        Obs  Pr(Skewness)  Pr(Kurtosis) adj chi2(2)   Prob>chi2
      -------------+---------------------------------------------------------------
               res |        220     0.0067        0.0003       16.95         0.0002
      
      .
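
      For reference, a minimal sketch of the workflow that typically produces a residual variable like res (the outcome y and the one-regressor model below are placeholders, not the specification used in this thread): fit the model, store the residuals, then inspect them graphically as well as formally.

      Code:
      * hedged sketch with placeholder names: y is the outcome, size a regressor
      regress y size
      predict res, residuals      // residuals for the estimation sample
      sktest res                  // skewness/kurtosis test of normality
      qnorm res                   // normal quantile plot of the residuals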



      • #18
        I don't put much faith in these tests. They are sensitive and do what they are designed to do, but they remind one of worlds depicted by magazines with perfect scenery, homes, gardens, gadgets and partners of the appropriate sex, all looking completely flawless. In research with typical observational data, you have to put up with less than perfection, or else reject all models [sic] as being imperfect.

        Here is something reproducible that you can copy and run in a new Stata session. To match your sample size of 220, I sample 220 values from a t distribution with 20 degrees of freedom. So the test works as designed: it detects kurtosis beyond what a normal distribution would show. But the distribution is better behaved than what one often gets with residuals that "should be" normal, as the graph will reveal.

        Code:
        clear
        set obs 220                  // match the sample size of res
        set seed 2803                // reproducible draws
        gen t20 = rt(20)             // simulate from a t distribution with 20 df
        sktest t20                   // skewness/kurtosis test of normality
        qnorm t20                    // normal quantile plot of the simulated variable

        Now your residuals are worse. But the reaction is not so much "This is bad!" as "Can I improve the model?", and I don't know: I can't see your data, I don't know your field, and I don't know how well such models are expected to work. Perhaps others can help. My contribution is just to say that it is nice when the residuals look really well behaved, but sometimes you can't get far with simple models.



        • #19
          Okay, thank you, this was really helpful!



          • #20
            Dear All,

            I may have been responsible for the comment that Barbara Heerkens mentioned in #1. My problem with so-called robust regression is that it is not clear to me what is being estimated. Most "robust" estimators are presented as methods to estimate the (conditional) mean that are less sensitive to outliers than OLS. However, by definition, the (conditional) mean is sensitive to the tails of the distribution, and any estimator that is resistant to outliers can only estimate the mean under very specific conditions. In my view, if we are worried about outliers we should estimate robust measures of location (e.g., the median or the mode) rather than trying to find a robust estimator for a measure of location that is sensitive to outliers (the mean). This point of view is explained in this paper.
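
            To make the contrast concrete, here is a minimal, hedged sketch with simulated data (the variables y and x are made up for illustration, not taken from this thread): -regress- targets the conditional mean, which heavy tails can pull around, while -qreg- targets the conditional median, a robust measure of location.

            Code:
            * hedged illustration with simulated, hypothetical data
            clear
            set obs 200
            set seed 2803
            gen x = rnormal()
            gen y = 1 + 2*x + rt(3)    // heavy-tailed errors
            regress y x                // estimates the conditional mean
            qreg y x                   // estimates the conditional median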

            Best wishes,

            Joao



            • #21
              Thank you Joao!

