  • How to interpret goodness of fit for multivariate logistic regression model

    Dear STATA technicians,

    I am Hatem Ali, a Stata user. Serial number: 301506305680
    I am attaching the multivariate logistic regression analysis I ran in Stata.
    I am trying to calculate the goodness of fit for this model.
    However, I can't find a way to do that.
    Is it the pseudo R-squared?
    Is there a certain value above which the goodness of fit is considered good?

    Looking forward to hearing back from you.

    Regards,
    Hatem Ali

    Attached Files

  • #2
    For a logistic regression, we normally assess the model for both discrimination and calibration. Discrimination is a measure of the model's ability to distinguish those observations with 0 outcomes from those with 1 outcomes. Calibration is a measure of how closely the predicted probabilities match the observed probabilities.

    Discrimination is usually measured using the area under the receiver operating characteristic (ROC) curve, also referred to as the Harrell C-statistic. Calibration is most commonly measured with the Hosmer-Lemeshow statistic. You can get the ROC curve area by running the -lroc- command after your logistic command runs. Then you can also get the Hosmer-Lemeshow statistic by running -estat gof, group(10)-.
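
A minimal sketch of the two commands mentioned above, in sequence. Since the original attachment is not shown, this assumes Stata's shipped example dataset `lbw` as a stand-in; the outcome `low` and the covariate list are illustrative only and are not the poster's model.

```stata
* Assumption: the lbw example dataset stands in for the poster's data
webuse lbw, clear

* Fit the logistic model (outcome and covariates are illustrative only)
logistic low age lwt i.race smoke ptl ht ui

* Discrimination: area under the ROC curve, returned in r(area)
lroc, nograph
display "ROC area = " %5.3f r(area)

* Calibration: Hosmer-Lemeshow statistic with 10 groups
estat gof, group(10)
display "H-L chi2 = " %6.2f r(chi2) " on " r(df) " df"
```

Dropping the `nograph` option makes -lroc- draw the ROC curve as well as report its area.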

    A few comments on the Hosmer-Lemeshow statistic. Personally, I find it more helpful to see the actual comparison between observed and expected outcome probabilities in the deciles of risk than the summary chi-square (and I don't find the p-value useful at all). So I generally run that as -estat gof, group(10) table-. In addition, specifying 10 groups is conventional, but in large samples I recommend specifying a larger number of groups. Generally I like each group to have 50 to 100 observations in it.
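
The group-size advice above could be sketched as follows. This again assumes the `lbw` example dataset and an illustrative model; the target of 75 observations per group is a hypothetical midpoint of the 50 to 100 range, and the floor of 10 groups is likewise an assumption, not a rule from this thread.

```stata
* Assumption: the lbw example dataset and this model are illustrative only
webuse lbw, clear
logistic low age lwt i.race smoke ptl ht ui

* Observed vs expected counts in each decile of risk
estat gof, group(10) table

* Hypothetical rule: about 75 observations per group, never fewer than 10 groups
local g = max(10, floor(e(N)/75))
display "groups = `g'"
estat gof, group(`g') table
```

In a small sample such as this one the rule simply falls back to 10 groups; it only changes the grouping once the estimation sample is large enough.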

    Please read the Forum FAQ for excellent advice on how to get the most out of your Statalist activity. You will find there, among other things, that attachments are discouraged. Attachments of Word documents are especially discouraged because some people who might respond to your question will be reluctant to download any attachment from a stranger, especially one like a Word document that might contain active malware. (I have not looked at your attachment, which accounts for the highly generic nature of my response.) When it is appropriate to show results (and, in this case, it is), the better way is to copy/paste from Stata's Results window or your log file directly into the Forum, and surround it with code delimiters. Instructions on the use of code delimiters will be found in FAQ #12 if you are not familiar with them. You can also learn about them from David Benson's video https://youtu.be/bXfaRCAOPbI.


    • #3
      Thanks for getting back to me. This is very helpful.
      How can I check the assumptions and residuals for the logistic regression model?
