  • Logistic regression - Influential outliers

    Hi,

    I have a dataset of 463 observations. The dependent variable is binary, and I am fitting a logistic regression. I am new to the concepts of outliers, leverage, and influence.
    I do not intend to delete outliers, only to describe their impact on my model.
    My problem is that I cannot get Stata to accept the 'rstudent' or 'cooksd' command after I fit my regression. As far as I understand, I should be able to use Cook's distance to identify influential outliers. So far I have only been able to obtain Pearson residuals and calculate leverage.

    Can anyone with more practice in Stata see any problems in the above?

    Kind regards

  • #2
    The spost13 package (findit spost13_ado) includes a command called leastlikely. According to the help,

    For regression models for categorical dependent variables, leastlikely lists the in-sample observations with the lowest predicted probabilities of observing the outcome value that was actually observed. For example, in a model with a binary dependent variable, leastlikely lists the observations that have the lowest predicted probability of depvar=0 among those cases for which depvar=0, and it lists the observations that have the lowest predicted probability of depvar=1 among those cases for which depvar=1. The least likely values represent relatively deviant cases that may warrant closer inspection.
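
    A minimal sketch of how this might be used, assuming spost13 is already installed and using the lbw example dataset and a model specification chosen only for illustration (neither comes from the original question). The variables listed after leastlikely are simply shown alongside the flagged observations.

    ```stata
    * Illustrative data only: lbw is a Stata example dataset, not the poster's data
    webuse lbw, clear

    * Fit a logit model with a binary outcome (hypothetical specification)
    logit low age lwt i.race smoke

    * List the observations whose observed outcome had the lowest predicted probability,
    * displaying age, lwt and smoke for each of them
    leastlikely age lwt smoke
    ```

    See help leastlikely after installation for the available options.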
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam


    • #3
      Thank you, I will look into that package tomorrow.
      I would still very much like to hear about using studentized residuals and Cook's distance in Stata. I simply can't make them work, and they seem like a fairly common way of describing and identifying influential outliers.

      Thank you in advance


      • #4
        The predict options after logistic and glm offer several influence statistics.
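
        As a rough sketch of what that looks like in practice: rstudent and cooksd are options of predict after regress, not standalone commands, so they are not accepted after logistic. The closest analogues after logistic are rstandard, hat, and dbeta (Pregibon's delta-beta, which plays the role of Cook's distance). The lbw example dataset, the model specification, and the new variable names below are assumptions for illustration only.

        ```stata
        * Illustrative data only: lbw is a Stata example dataset, not the poster's data
        webuse lbw, clear

        * Fit the logistic regression (hypothetical specification)
        logistic low age lwt i.race smoke ptl ht ui

        * Influence diagnostics available through predict after logistic
        * predicted probability of a positive outcome
        predict p_hat
        * standardized Pearson residuals
        predict r_std, rstandard
        * Pregibon leverage
        predict lev, hat
        * Pregibon delta-beta influence statistic
        predict d_beta, dbeta
        * Hosmer-Lemeshow delta chi-squared and delta deviance
        predict d_chi2, dx2
        predict d_dev, ddeviance

        * Show the ten observations with the largest delta-beta
        gsort -d_beta
        list low age lwt p_hat r_std lev d_beta in 1/10

        * The same model through glm, whose predict does accept cooksd
        glm low age lwt i.race smoke ptl ht ui, family(binomial) link(logit)
        predict cook, cooksd
        predict r_stu, deviance studentized
        ```

        Note that after logistic these statistics are adjusted for the number of observations sharing a covariate pattern, whereas after glm they are computed per observation, so the two sets of values need not match.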
        Steve Samuels
        Statistical Consulting
        [email protected]

        Stata 14.2


        • #5
          You can also fit the same model with regress and use some of the predict options available after regress, if the issues are on the right-hand side (the predictors) and you are just doing checks.
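
          A minimal sketch of that check, assuming the lbw example dataset and an illustrative specification (not the poster's data). The linear fit here serves only as a screening device for unusual predictor values, not as the final model, and the 4/n cutoff is a conventional rule of thumb rather than a formal test.

          ```stata
          * Illustrative data only: lbw is a Stata example dataset
          webuse lbw, clear

          * Linear regression on the binary outcome, used only for diagnostics
          regress low age lwt i.race smoke ptl ht ui

          * These predict options exist after regress
          predict r_student, rstudent
          predict cook_d, cooksd
          predict lev_ols, leverage

          * Flag observations above the 4/n rule of thumb for Cook's distance
          list low age lwt r_student lev_ols cook_d if cook_d > 4/e(N) & !missing(cook_d)
          ```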
