  • Reason not to use MM regress or MS regress

    Hi all,

    Sorry to open up this thread, but I have tried searching the older threads for a while now.
    The reason is that I read in an old thread yesterday that not everybody is in favour of using -mmregress- or -msregress-, since it is not exactly clear what they calculate (or something along those lines).
    I have tried to find that thread again, but unfortunately cannot.

    So does anyone know why it might be preferable to use -xtreg, re robust cluster(id)- in a random effects model, rather than -msregress-? (I am using dummies, hence -msregress- rather than -mmregress-.)

    Thank you!

  • #2
    I can address at least part of this

    What is biting here is a very common confusion. Asking for robust standard errors (which go by various other names, one or more of Eicker, Huber, White or sandwich, and perhaps yet others) is not at all the same as robust regression, which is a loose super-family of families of regression methods with the broad general aim of being less sensitive to outliers or to the tails of error distributions.

    The term robust in this general sense can be identified as introduced by George Box in 1953. See Stephen Stigler's 2010 paper for more. There is a naughty copy at http://www.econ.uiuc.edu/~roger/cour...Robustness.pdf

    That aside, I see no competition here in practice. mmregress and msregress (which as you are asked to explain are community-contributed commands from the Stata Journal)

    SJ-10-2 st0173_1 . . . . . . . . . . . . . . . . . . Software update for mcd
    . . . . . . . . . . . . . . . . . . . . . . . V. Verardi and C. Croux
    (mmregress, sregress, msregress, mregress, mcd if installed)
    Q2/10 SJ 10(2):313
    outlier option replaced by generate() option in mcd

    SJ-9-3 st0173 . . . . . . . . . . . . . . . . . . Robust regression in Stata
    . . . . . . . . . . . . . . . . . . . . . . . V. Verardi and C. Croux
    (mmregress, sregress, msregress, mregress, mcd if installed)
    Q3/09 SJ 9(3):439--453
    provides alternatives to rreg and qreg for robust-to-outlier
    regression; presents a graphical tool that recognizes the
    type of detected outliers

    have precisely nothing to do with random effects model for panel data.
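
    To make the distinction between robust standard errors and robust regression concrete, here is a minimal sketch. The auto dataset shipped with Stata and the variables price and weight are my choice for illustration, not anything from this thread, and rreg (official Stata) merely stands in for "some robust-to-outlier regression command".

```stata
* Same model three ways: plain OLS, OLS with robust SEs, robust regression
sysuse auto, clear

regress price weight
scalar b_ols = _b[weight]

* vce(robust) changes the standard errors only, not the coefficients
regress price weight, vce(robust)
scalar b_rse = _b[weight]

* rreg downweights observations with large residuals, so the fit itself changes
rreg price weight
scalar b_rreg = _b[weight]

display "OLS: " b_ols "   OLS, robust SE: " b_rse "   rreg: " b_rreg
```

    The first two slopes should agree, because only the standard errors differ; the rreg slope will generally differ because the observations are reweighted.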



    • #3
      Thank you for your answer Nick! I really appreciate your effort!

      The last sentence worries me:
      have precisely nothing to do with random effects model for panel data.
      Do you mean that -mmregress- or -msregress- should not be used for a random effects model for panel data?



      • #4
        Barbara:
        I suspect, as Nick said, that you're mixing up two different meanings of -robust-ness.
        As far as panel data analysis is concerned (I assume that your main methodological interest focuses on that), under -xtreg- standard errors are clustered/robustified when you suspect or have evidence of heteroskedasticity and/or autocorrelation (the latter usually bites less hard in N>T panel datasets, which are the ones suitable for -xtreg-).
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          It's more than "should not". As I understand it, such models are not even on offer. Which part of the syntax or code supports (a) panel data (b) random effects?



          • #6
            Thank you Carlo!
            Okay, maybe I am (indeed) interpreting -msregress- wrongly. I thought that I could run -msregress- instead of -xtreg- and then be 'safer' from the outliers.
            I thought the coefficients and p-values of my variables coming out of the -msregress- would be the ones I should report.
            I am quite certain that there are some outliers in my data, therefore I wanted to use a regression which controls for outliers....
            Did I assume this for no reason?



            • #7
              If you want to use xtreg and get a fit that is robust to outliers, then robust standard errors won't achieve that. The coefficient estimates are just the same! Possibly, transformation may help to tame the outliers.
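
              A quick way to check this, sketched with the nlswork example dataset from the Stata manuals (an assumption for illustration, and webuse needs an internet connection; substitute your own panel and variables):

```stata
* Random-effects fit with default and with cluster-robust standard errors
webuse nlswork, clear
xtset idcode year

xtreg ln_wage tenure age, re
matrix b_plain = e(b)

xtreg ln_wage tenure age, re vce(cluster idcode)
matrix b_clust = e(b)

* Largest relative difference between the two coefficient vectors
display mreldif(b_plain, b_clust)
```

              The displayed difference should be zero up to rounding: clustering alters the standard errors and p-values, not the fitted coefficients.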



              • #8
                Barbara:
                as a general rule, setting aside those instances when outliers are simply the offspring of mistaken data entry, outliers are simply a matter of fact.
                However, differences do exist across research fields: physicians are often more comfortable with the median, because it gets rid of outliers; (health) economists are usually more interested in the mean, in part out of laziness (mean*# of observations=total; the same approach would not work with the median), and, jokes aside, because overall costs usually follow a Gamma (and not a Normal) distribution: hence the right tail of the distribution can be informative.
                All that said, I cannot remember on this list any thread dealing with panel data analysis that proposed -msregress- vs -xtreg- (-xtgls-) when the dependent variable was continuous.

                PS: crossed in cyberspace with Nick's helpful reply.
                Last edited by Carlo Lazzaro; 25 May 2018, 03:09.
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Thank you both for your time and effort!!!!
                  Unfortunately, I have already transformed my data using logarithms and cube roots, so I guess outliers are simply a part of the data...
                  If -msregress- is not the right way to run a regression that puts less weight on these outliers, what would be the best way of doing that?
                  I know there is much disagreement on how to handle outliers, but just running an -xtreg, re cluster(id)- (after already searching for mistaken data entry and transforming my data) ending up with skewed residuals and leaving it like that doesn't sound like something that is appropriate to me... BUT correct me if I am wrong!!



                  • #10
                    What do you regard as an outlier? Please show this graphically, e.g. with a scatter plot matrix or multiple quantile plots (http://fmwww.bc.edu/repec/usug2016/cox_uksug16.pptx is a survey).
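
                    As a rough sketch of the kind of graphs meant here, using only official commands and the auto dataset shipped with Stata (dataset, variables and graph names are placeholders chosen for illustration):

```stata
* Scatter plot matrix and quantile plots for a first look at possible outliers
sysuse auto, clear

* Lower-triangle scatter plot matrix of the variables of interest
graph matrix price weight mpg, half name(scatmat, replace)

* Quantile plot: ordered values against their plotting positions
quantile price, name(qprice, replace)

* Normal quantile plot: ordered values against normal quantiles
qnorm price, name(qnprice, replace)
```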



                    • #11
                      Barbara:
                      once you have excluded the risk of endogeneity in your regression model, there's nothing more you can do than impose clustered standard errors.
                      By the way, the cluster/robust option will not straighten the residual distribution, but will simply ask Stata to deal with it.
                      Transformations are welcome whenever you can easily manage the way back (i.e., re-transforming to the original metric).
                      I'm pretty familiar with log-linear models, which can at times fix omitted-variable and heteroskedasticity problems, work well with positively skewed distributions and are easy enough to explain to an audience with a limited smattering of statistics.
                      Conversely, I cannot say anything about the cube-root transformation.

                      PS: crossed in cyberspace with Nick's helpful reply, which focuses on visual inspection (something that can never be recommended enough).
                      Kind regards,
                      Carlo
                      (Stata 19.0)



                      • #12
                        Sorry, I had to read the FAQ before posting the graphs the right way!
                        Thanks for answering me!
                        I see an outlier as an observation with a large absolute residual value.
                        So for this regression:
                        dependent variable: size = ln(assets)
                        independent: dummies for post, connected and their interaction (post##conn), cube root of leverage lagged 1 year, profitability as ln(ebitda/assets) lagged 1 year, ln(capex) lagged 1 year and cube root of cash/assets lagged 1 year.

                        Code:
                        xtreg size post##conn cblev_lag1 prof_lag1 lcapex_lag1 cbca_lag1 i.sic2 i.year, re cluster(_j)
                        I ran the regression and found out the residuals were skewed. Here are the quantiles of all variables (and the image of the qnorm residuals in the next post due to the limit)
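
                        For readers who want to reproduce this kind of residual check, here is a minimal sketch. It uses the nlswork example dataset from the Stata manuals rather than the data in this thread, and it assumes the residual of interest is the idiosyncratic error (predict's e option); predict's ue option would give the combined residual instead.

```stata
* Residual diagnostics after a random-effects fit with clustered SEs
webuse nlswork, clear
xtset idcode year

xtreg ln_wage tenure age i.year, re vce(cluster idcode)

* Idiosyncratic residuals e_it
predict double res_e, e

* Skewness is reported among the detailed summary statistics
summarize res_e, detail
scalar skew_e = r(skewness)

* Normal quantile plot of the residuals
qnorm res_e, name(qn_res, replace)
```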


                        Thank you for your effort!

                        Attached Files



                        • #13
                          qnorm res
                          Attached Files



                          • #14
                            Unfortunately, you didn't present graphs the right way. gph attachments are not ideal at all. At best people have to go back and forth between such graphs opened in Stata and your question in this forum. At worst people can't open your graphs at all -- which will apply to people using devices (mobile/cell phones, tablets, etc.) that don't have Stata loaded and to people using older versions of Stata if their version doesn't support the graph file format in your Stata.

                            We do spell this out in the FAQ

                            12.4 Posting image attachments: please do use .png

                            Stata graphs should be posted as .png file attachments (start with the Clipboard icon).

                            See next section 12.5 for why other images (e.g. screenshots) are usually less helpful than you imagine.

                            12.5 Posting attachments: please don't...

                            There are several "please don't" requests here, but good reasons for them all.

                            Please do not post .gph files, as they can't be read without flipping back and forth between Stata and the forum software, thus making your posts much more difficult to follow.
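
                            For anyone holding only a .gph file, a minimal sketch of turning it into a .png that can be attached. The filenames are hypothetical, and the first three lines exist only to create a .gph so that the example is self-contained.

```stata
* Create and save a graph in Stata's own .gph format
sysuse auto, clear
qnorm price
graph save "qnorm_price.gph", replace

* Redisplay the saved graph, then export it as a .png
graph use "qnorm_price.gph"
graph export "qnorm_price.png", replace width(1200)
```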

                            Further, there is not an equivalence between big residuals and outliers as some outliers will pull the fit towards them. That is why I asked to see raw data.
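
                            A small simulated sketch of that point (all numbers are made up for illustration): twenty points lie near a line, and one far-out point with high leverage drags the fit towards itself, so its own residual need not be the largest in the data.

```stata
* Twenty well-behaved points close to y = 2 + 0.5 x
clear
set seed 12345
set obs 20
generate x = _n
generate y = 2 + 0.5*x + rnormal(0, 0.2)

* One far-out point that does not follow the line
set obs 21
replace x = 100 in 21
replace y = 0 in 21

regress y x
predict double res, residuals
predict double lev, leverage

* Compare the last observation's residual and leverage with the rest
list x y res lev, clean
```

                            In this setup the last observation has leverage above 0.9, so the fitted line is largely determined by it, and a residual-only definition of "outlier" can easily miss it.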
                            Last edited by Nick Cox; 25 May 2018, 04:52.



                            • #15
                              Oh no! I am so sorry, I think I misread the text in my hurry!
                              I hope the graphs are provided correctly this time!
                              Attached Files
