  • MM robust regression residuals

    Hello, I am following a specific econometrics model in finance where I need to carry out an MM robust regression and then extract the fitted values and residuals of this regression. I installed the MM robust regression package in Stata and ran my regression analysis with the "mmregress" command.

    I need the residuals, fitted values, and R-squared of this regression, but I cannot get them through any of the conventional procedures. For example, when I try to extract residuals by typing "predict residuals, residuals", Stata says "option not allowed" and returns r(198). When I try to retrieve residuals from the menu by clicking on Postestimation > Predictions, residuals, etc., it says "this operation requires that you previously performed an estimation". Please help. I would highly appreciate any feedback from members. Thanks.

  • #2
    If you take another look at the help for mmregress, you will see that the outlier option creates the Robust Standardized Residual (residual divided by a robust estimate of scale) as the variable S_stdres; the estimate of scale is in the results section, but is also given in the returned scalar e(scale). Notice that the Stata Journal article doesn't mention r-square. Usually it's the squared correlation of observed and predicted, but mmregress downweights or excludes outliers and high leverage points, thus the usual measure is not correct and you compute it at your own risk.
    Code:
    sysuse auto, clear
    replace mpg = 80 in 5
    mmregress mpg turn weight, outlier graph
    gen residual = S_stdres*e(scale)
    gen predicted = mpg - residual
    See the associated Stata Journal article.

    Verardi, V., and C. Croux. 2009. Robust regression in Stata. Stata Journal 9, no. 3: 439-453.
    Last edited by Steve Samuels; 10 Apr 2016, 19:56.
    Steve Samuels
    Statistical Consulting

    Stata 14.2


    • #3
      Thanks a lot. That's very helpful indeed. I got it up and running. Thanks again!


      • #4
        http://www.stata.com/support/faqs/st...red/index.html is sympathetic to the idea that you can knit your own analog[ue] to R-squared by just squaring the correlation between observed and predicted. But that measure plays no part, even implicitly, in the robust regression. It might be of some use descriptively as a summary of how far the robust regression is from the usual version. (It would be even more important to keep track of changes in coefficients and graphically to monitor how the regression fit falls relative to the data.)
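
        A minimal sketch of that descriptive comparison, assuming the auto data used elsewhere in this thread; the regressors and the variable name yhat_mm are purely illustrative. It squares the correlation between observed and predicted values from the MM fit and sets it beside the ordinary R-squared from regress. Neither number plays any part in the robust estimation itself.
        Code:
        sysuse auto, clear
        * Robust fit; requires the user-written mmregress package
        mmregress mpg weight turn
        predict yhat_mm, xb
        * Squared correlation of observed and predicted (descriptive only)
        quietly correlate mpg yhat_mm
        display "MM squared correlation = " r(rho)^2
        * Ordinary least squares for comparison
        regress mpg weight turn
        display "OLS R-squared = " e(r2)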


        • #5
          Hello, I got a reply from the author of mmregress. He suggested the procedure below for postestimation and R-squared. I thought it would be helpful for others.

          let's say you estimate model

          mmregress y x1 x2 x3

          you can predict the residuals by doing:

          predict yhat
          gen res=y-yhat

          For the R2 there are several options. Probably the easiest is to compute 1-(s1/s2)^2, where s1 is the residual scale of the complete model and s2 is the scale of a model with just a constant.
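
          Written out, with \(s_1\) the residual scale of the complete model and \(s_2\) the scale of the constant-only model, the suggestion is
          \[ R^2 = 1 - \left(\frac{s_1}{s_2}\right)^2 \]
          Presumably \(s_1\) is the value that mmregress reports as "Scale parameter" and returns in e(scale); that reading is an assumption, since the email does not say so, and it also does not say how the constant-only scale is best obtained. As a purely hypothetical illustration of the arithmetic, \(s_1 = 2\) and \(s_2 = 4\) would give \(R^2 = 1 - (2/4)^2 = 0.75\).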



          • #6
            Thanks for sharing this, Sanullah. This version is akin to adjusted \(R^2\) in ordinary regression and would be my first choice. Would you please tell us which of the esteemed authors of mmregress wrote to you and the other options he suggested. Perhaps you can quote from the original email.
            Last edited by Steve Samuels; 11 Apr 2016, 21:17.
            Steve Samuels


            • #7
              I received the answer from Professor Vincenzo Verardi. He did not suggest any other options. I copied his email reply in full in my post above.


              • #8
                I think much depends on whether you want to compare robust regressions with each other or with plain regressions.


                • #9
                  I believe this question fits in this thread. I'm having problems getting the residuals from mmregress right. When calculated in two different ways they don't match:

                  mmregress mpg weight, outlier
                  predict p, xb
                  gen r=mpg-p
                  gen r2=S_stdres*e(scale)
                  pwcorr r2 r

                  | r2 r
                  -------------+------------------
                  r2 | 1.0000
                  r | 0.9983 1.0000

                  sum r2 r

                  Variable | Obs Mean Std. Dev. Min Max
                  -------------+---------------------------------------------------------
                  r2 | 74 .957506 3.515488 -5.348241 15.82534
                  r | 74 .8222398 3.472092 -5.646473 15.42817



                  From my understanding the results ought to be identical, but they aren't. The result is the same when p is calculated manually from the coefficients. Where do I go wrong?

                  Regards

                  Jonas Selmeryd



                  • #10
                    Good catch. The difference occurs because mmregress does two robust regressions. The initial one is an S-regression to estimate the scale parameter. (This S-regression can be displayed by adding the initial option to mmregress.) According to the Verardi-Croux article (p. 443), the "S-estimator" is very robust (it can tolerate up to 50% outliers) but has low efficiency (high standard errors) at a Gaussian distribution. Therefore, as a second step, the program does MM regression, but with the scale parameter fixed at the value produced in the first step. The residual from the S-regression is the one produced by
                    Code:
                    gen r2=S_stdres*e(scale)
                    This is illustrated in the following log. Here r3 is the observed-minus-predicted residual from the S-estimator.
                    Code:
                    . set seed 438205
                    
                    . sysuse auto, clear
                    
                    . mmregress mpg weight, outlier
                    The total number of p-subsets to check is 20
                    ------------------------------------------------------------------------------
                             mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                          weight |  -.0052036   .0003089   -16.85   0.000    -.0058194   -.0045879
                           _cons |   36.18723   1.139205    31.77   0.000     33.91627    38.45819
                    ------------------------------------------------------------------------------
                    Scale parameter=  1.890264
                    
                    . predict p1, xb
                    
                    . gen r1 = mpg-p1
                    
                    . gen r2=S_stdr*e(scale)
                    
                    . mmregress mpg weight, outlier initial  // S-estimator
                    The total number of p-subsets to check is 20
                    ------------------------------------------------------------------------------
                             mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                          weight |  -.0049362   .0004281   -11.53   0.000    -.0057896   -.0040827
                           _cons |   35.24437   1.554421    22.67   0.000     32.14569    38.34305
                    ------------------------------------------------------------------------------
                    Scale parameter=  1.890267
                    
                    . predict p3, xb
                    
                    . gen r3  = mpg-p3
                    
                    . sum r1 r2 r3
                    
                        Variable |        Obs        Mean    Std. Dev.       Min        Max
                    -------------+---------------------------------------------------------
                              r1 |         74    .8222416    3.472094  -5.646461   15.42819
                              r2 |         74    .9576024     3.51551  -5.348072   15.82556
                              r3 |         74    .9575548    3.515501   -5.34815   15.82546
                    
                    . corr r1 r2 r3
                                 |       r1       r2       r3
                    -------------+---------------------------
                              r1 |   1.0000
                              r2 |   0.9983   1.0000
                              r3 |   0.9983   1.0000   1.0000
                    Now a request: FAQ 12 asks that you post all code and results between CODE delimiters, described there. Please do so in the future. It isn't enough to use a monospace font, because, as you can see, column headings and dividers don't line up. That isn't a serious issue here, but can be with more extensive output.
                    Last edited by Steve Samuels; 05 Jun 2016, 11:32.
                    Steve Samuels


                    • #11
                      Sorry about the missing CODE delimiters. This was my first post on statalist.org. I will do better next time!

                      If I interpret what you say correctly, it is better to use observed minus predicted than e(scale)*S_stdres, since the latter does not include the MM-estimation, yes?


                      • #12
                        You do interpret correctly: it's better to use observed minus predicted, as that is the final fit.
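
                        A minimal sketch of that recommendation, reusing the model from #9; the variable names yhat and res are illustrative only. The outlier option is left out here because it is not needed for this calculation.
                        Code:
                        sysuse auto, clear
                        mmregress mpg weight
                        * Fitted values from the final MM fit
                        predict yhat, xb
                        * Observed minus predicted: residuals of the final fit
                        gen res = mpg - yhat
                        summarize res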
                        Steve Samuels