
  • Diagnostics and residuals for ordered logistic regression (ologit)

    Dear Statalist members,

    I am fitting an ordered logistic regression model using ologit in Stata 19.5 BE with a 4-level ordinal outcome (limp_score). I have already conducted some standard diagnostics:
    • I tested the proportional odds assumption using estat parallel.
    • I assessed model fit using the user-written command ologitgof.
    Both tests suggest the model assumptions are reasonably satisfied.

    However, I am unsure about how to evaluate residuals or other diagnostic measures for an ordered logit model in Stata. In linear regression I would typically examine residual vs. fitted plots and other residual diagnostics, but it is not clear to me what the appropriate equivalent diagnostics are for ologit.

    Specifically, I would appreciate guidance on:
    1. What types of residuals or influence diagnostics are recommended for ordered logistic regression?
    2. Are there Stata commands or workflows to examine these (e.g., generalized residuals, deviance residuals, leverage, etc.)?
    3. Are plots of residuals vs. predicted values meaningful in this context, and if so, which residuals should be used?
    Any recommendations on best practices for post-estimation diagnostics for ordered logistic models in Stata would be greatly appreciated.

    Thank you very much for your help.

    Best regards,
    Paula Olivares Guzmán

  • #2
    You can get leverage values by running -regress- instead of -ologit- and then using -predict- with the leverage option, since leverage is a function of the explanatory variable values and not of the estimation procedure.
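
    A minimal sketch of that leverage step, assuming the same auto data and covariates as the example further down. The variable names insample and lev are illustrative, and the 2 x mean cutoff is only a common rule of thumb used here to shorten the listing, not a recommendation specific to -ologit-.

```stata
sysuse auto, clear
* Fit the ordered logit first so the estimation sample is known
ologit rep78 weight i.foreign
gen byte insample = e(sample)
* Linear regression on the same covariates and sample; it is used only
* to obtain the diagonal of the hat matrix, not for its coefficients
quietly regress rep78 weight i.foreign if insample
predict double lev if e(sample), leverage
* Mean leverage equals (number of regress coefficients)/(number of observations)
summarize lev
local cutoff = 2*r(mean)
* List observations whose leverage exceeds the illustrative cutoff
list make rep78 weight foreign lev if lev > `cutoff' & !missing(lev)
```

    The listing is only a starting point for looking at the covariate values of the flagged cars; whether any of them matters is a substantive judgment.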


    One can examine the influence of each observation on the coefficient estimates by re-estimating the model _N times, dropping one observation each time and capturing the coefficient values. Even though -dfbeta- doesn't work after -ologit-, you can emulate it on a do-it-yourself basis as illustrated below:

    Code:
    sysuse auto
    // Estimates from original model
    ologit rep78 weight i.foreign
    local k = e(k)
    mat B = e(b)
    gen byte insample = e(sample)
    //
    forval i = 1/`k' {
      qui gen dfb`i' = .
      local se`i' = sqrt(e(V)[`i', `i'])
    }
    //
    // Run regression, dropping each observation in turn
    forval i = 1/`=_N' {
       qui ologit rep78 weight i.foreign if (_n != `i')
       // Save the raw change in the estimated coefficient vs. the original model
       forval j = 1/`k' {
          qui replace dfb`j' = e(b)[1,`j'] - B[1,`j'] in `i' if insample
       }
    }
    // Scale each difference by the original se(b_j), as commonly is done for dfbetas, if you like
    forval j = 1/`k' {
       qui replace dfb`j' = dfb`j'/`se`j''
    }
    //
    browse dfb*
    Comments:
    1) I'd use something like this to scan which observations make a substantively interesting difference in any coefficient. I would not try to impose a statistically based definition of "how big is the dfbeta," but instead would just look to see if any of them are big enough to matter.
    2) There's presumably a more built-in way to do this with -jackknife-, but what I've done here makes it easy to capture the values of interest in your original data file.
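
    As a compact sketch of the scan described in comment 1, the following repeats the leave-one-out idea for a single coefficient (weight) and then lists the observations that move it most. The names b_full, se_full, dfb_weight and abs_dfb are illustrative, and showing the top five is an arbitrary choice.

```stata
sysuse auto, clear
* Full-sample fit: keep the estimation sample, coefficient, and standard error
ologit rep78 weight i.foreign
gen byte insample = e(sample)
scalar b_full  = _b[weight]
scalar se_full = _se[weight]
gen double dfb_weight = .
* Refit without observation i and store the scaled change in the weight coefficient
forvalues i = 1/`=_N' {
    if insample[`i'] {
        quietly ologit rep78 weight i.foreign if _n != `i'
        quietly replace dfb_weight = (_b[weight] - b_full)/se_full in `i'
    }
}
* Sort by absolute size (gsort puts missing values last by default)
gen double abs_dfb = abs(dfb_weight)
gsort -abs_dfb
list make rep78 weight foreign dfb_weight in 1/5
```

    Note that -gsort- reorders the data, so run it only after the loop, which identifies observations by _n.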

