  • Studentized deleted residuals and DFfits after logistic regression in Stata. How to calculate?

    How can I calculate studentized deleted (externally studentized, jackknifed) residuals and DFFITS after fitting a logistic regression in Stata? The rstudent and dfits postestimation options are available only after regress, not after logit.

  • #2
    I'll rephrase the question a bit.
    The rstudent and dfits postestimation options in Stata are available only after regress, not after logistic regression. That makes me think that studentized deleted residuals and DFFITS are not applicable to logistic regression. But many sources in the literature say the opposite: that they are both possible and necessary in logistic regression.
    **The first question: is it correct to calculate these statistics in logistic regression?**
    **The second question: how do I do it in Stata?**
    Maybe I need to change the link type or transform the variables (x1, x2, x3 and/or y), run the regress command, and then apply rstudent and dfits? Or do I need to install some additional module?

    • #3
      You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

      Whether it makes sense or not, I can't tell you, but you can use predict after logit to get predicted values, Pearson residuals and standardized residuals. If by studentized you mean something other than standardized, you can easily do the calculation using the output of predict.
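
      A minimal sketch of that calculation, assuming the auto dataset shipped with Stata and an illustrative model of foreign on weight and mpg (neither comes from the original question). It pulls Pearson, standardized Pearson, and deviance residuals plus the leverage from predict, then rebuilds the standardized residual by hand. Note that after logit these statistics are computed per covariate pattern, not per observation.

      ```stata
      sysuse auto, clear
      logit foreign weight mpg          // illustrative model, an assumption

      predict double rp,   residuals    // Pearson residuals
      predict double rs,   rstandard    // standardized Pearson residuals
      predict double rdev, deviance     // deviance residuals
      predict double h,    hat          // Pregibon leverage

      // Rebuild the standardized Pearson residual from predict output;
      // this should match rstandard if it follows the documented
      // formula rp / sqrt(1 - h)
      generate double rs_manual = rp / sqrt(1 - h)
      assert reldif(rs, rs_manual) < 1e-6

      list make foreign rp rs rdev h in 1/5
      ```

      Any other scaling of the residuals can be built the same way from rp, rdev, and h.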

      • #4
        Hi Alexey
        As you correctly found, the rstudent and dfits options are only allowed after the regress command. There are other statistics that can be useful for what you are trying to do and that are compatible (within Stata) with the idea of the leave-one-out residual and/or the leverage statistics (which are related to the two options you have found).
        Look into "help logit postestimation##predict". Perhaps one of those options is what you need.
        In addition, if you have found literature that says the opposite, then just dig in and see exactly how they suggest those statistics can be obtained. In other words, if Stata cannot do what you want it to do, you can teach it how to do it.
        HTH

        • #5
          See Yulia Marchenko's blog entry "Using resampling methods to detect influential points" here.
          Steve Samuels
          Statistical Consulting
          [email protected]

          Stata 14.2

          • #6
            Hi Alexey. When it comes to residuals, be aware that terminology can vary across stats packages and authors. Here are a couple of slides I use to point out differences between SPSS and Stata, for example. The example on slide 205 is from a linear regression model. HTH.

            [Image: IntroBiostats_09_slide204.png]
            [Image: IntroBiostats_09_slide205.png]
            --
            Bruce Weaver
            Email: [email protected]
            Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
            Version: Stata/MP 18.0 (Windows)

            • #7
              Dear colleagues, thank you very much for your answers and advice.
              Analysis of residuals turned out not to be as simple as it might seem at first glance. There really is a big difference in terminology between packages.
              I need to calculate externally studentized (deleted) residuals:
              https://newonlinecourses.science.psu...t501/node/339/

              Unfortunately, Stata does not allow this through its standard procedures, so these residuals have to be calculated manually.

              To Steve Samuels: thanks. I guess jackknife estimation is the only way to calculate it.
              If I understand correctly, this can be calculated as:
              residual_i / sqrt[MSE_(i) * (1 - h_i)]
              where h_i is the leverage of observation i and MSE_(i) is the mean square error of the model fitted with observation i deleted.
              Please tell me how I can calculate (jackknifed) MSE?
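
              For a linear model, the deleted MSE can be obtained by brute force: refit the model once per observation, leaving that observation out, and store the squared root MSE. The sketch below assumes the auto dataset shipped with Stata and an illustrative regression of price on weight and mpg (both are assumptions, not part of the question), and compares the hand-built value with what predict, rstudent returns after regress.

              ```stata
              sysuse auto, clear
              regress price weight mpg              // illustrative model, an assumption

              predict double e,     residuals       // raw residuals from the full fit
              predict double h,     hat             // leverage from the full fit
              predict double rstud, rstudent        // Stata's studentized residual

              // Leave-one-out loop: MSE of the model refitted without observation i
              generate double mse_del = .
              local N = _N
              forvalues i = 1/`N' {
                  quietly regress price weight mpg if _n != `i'
                  quietly replace mse_del = e(rmse)^2 in `i'
              }

              // residual_i / sqrt[MSE_(i) * (1 - h_i)]
              generate double rstud_manual = e / sqrt(mse_del * (1 - h))
              assert reldif(rstud, rstud_manual) < 1e-6
              ```

              The loop refits the model _N times, which is cheap for small datasets. It relies on e(rmse), which regress stores but logit does not, so it does not carry over to a logistic model unchanged.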

              • #8
                Hi Alexey. As my slide in #6 notes, what SPSS refers to as a studentized deleted residual is called just a studentized residual in Stata documentation. So I think that predict with the rstudent option will give you what you want. Consider the following example, using data from the online notes you pointed to.

                Code:
                // Try to match the Minitab output on this page:
                // https://newonlinecourses.science.psu.edu/stat501/node/401/
                
                clear *
                import delimited ///
                https://newonlinecourses.science.psu.edu/stat501/sites/onlinecourses.science.psu.edu.stat501/files/data/influence2/index.txt , ///
                delimiter(whitespace, collapse)
                
                regress y x
                predict resid, resid // raw residual
                predict rstan, rstandard // standardized residual
                predict rstud, rstudent  // studentized residual
                format resid-rstud %8.5f
                list x-rstud if _n < 4 | _n > 18, sep(3)
                
                // "Again, the studentized deleted residuals appear in the column
                // labeled "TRES1." Minitab reports that the studentized deleted
                // residual for the red data point is t_21 = 6.69013."
                
                // What the author of the web-page calls TRES1 matches what
                // I have called rstud--i.e., it is the same thing that Stata
                // documentation refers to as the studentized residual.
                Here's the output from the -list- command at the end.

                Code:
                . list x-rstud if _n < 4 | _n > 18, sep(3)
                
                     +----------------------------------------------------+
                     |       x         y      resid      rstan      rstud |
                     |----------------------------------------------------|
                  1. |      .1    -.0716   -3.53297   -0.82635   -0.81917 |
                  2. |  .45401    4.1673   -1.07734   -0.24915   -0.24291 |
                  3. | 1.09765    6.5703   -1.91658   -0.43544   -0.42596 |
                     |----------------------------------------------------|
                 19. | 8.70156   46.5475   -0.24289   -0.05561   -0.05414 |
                 20. | 9.16463   45.7762   -3.34684   -0.77680   -0.76838 |
                 21. |       4        40   16.89298    3.68110    6.69013 |
                     +----------------------------------------------------+
                Compare that to the Minitab output on the website:



                Re this table, the author says: "Again, the studentized deleted residuals appear in the column labeled "TRES1." Minitab reports that the studentized deleted residual for the red data point is t21 = 6.69013."

                HTH.
                --
                Bruce Weaver
                Email: [email protected]
                Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
                Version: Stata/MP 18.0 (Windows)

                • #9
                  Dear Bruce, thank you! Yes, I experimented with Stata, SPSS and Statistica (StatSoft). It seems indeed there are big differences in terminology.
                  I need to get these residuals after logistic regression. "rstudent" is what I need, but this postestimation option is available after "regress" and not after "logit". It looks like it can only be calculated manually.

                  • #10
                    I see no way of calculating rstudent, because there is no term in the logistic equations that corresponds to the mean square error in multiple regression. I suggest that you look at the influence statistics given by predict after logistic.
                    Steve Samuels
                    Statistical Consulting
                    [email protected]

                    Stata 14.2
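
                    A minimal sketch of those influence statistics, assuming the auto dataset shipped with Stata and an illustrative model of foreign on weight and mpg (both are assumptions made only for the example). The predict options used are the ones listed under help logistic postestimation.

                    ```stata
                    sysuse auto, clear
                    logistic foreign weight mpg       // illustrative model, an assumption

                    predict double p,    pr           // fitted probability
                    predict double h,    hat          // Pregibon leverage
                    predict double db,   dbeta        // Pregibon delta-beta influence statistic
                    predict double dx2,  dx2          // Hosmer-Lemeshow delta chi-squared
                    predict double ddev, ddeviance    // Hosmer-Lemeshow delta-deviance

                    // List the five rows with the largest delta-beta values
                    gsort -db
                    list make foreign p h db dx2 ddev in 1/5
                    ```

                    Large values of db, dx2, or ddev flag covariate patterns whose removal would noticeably change the coefficients or the fit, which is the role DFFITS-type measures play after regress.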

                    • #11
                      Thank you! Now everything is clear.
