  • Best ways to test melogit model fit

    Hi everyone!

I am a beginner with multilevel models and am currently trying to interpret my models and evaluate their goodness of fit.
    I have survey data, which includes 60 000 individuals (level 1) from 19 countries (level 2). The dependent variable is dichotomous.

    I am using melogit with fixed and random effects. To assess goodness of fit, I have tried:
    • estat icc, with the results: ICC 0.0355061 and Std.Err 0.0113379.
    • lrtest with logit and melogit, which produced the results: LR chi2(1) = 1057.57; Prob > chi2 = 0.0000
    According to what I've read, the lrtest shows that adding the country level makes the model better. But the ICC is very low, which should mean that the country level does not explain a significant part of the variation in the dependent variable. Are my interpretations of these statistics correct? Which is more important? And what other tests might be useful for assessing goodness of fit and for comparing different melogit models in that respect? Are the Wald chi2 or the log likelihood good candidates here?

    [Using Stata 15]

    Thanks!

  • #2
    According to what I've read, the lrtest shows that adding the country level makes the model better.
    Better is not a statistical term, and the likelihood ratio test is not a measure of model fit. What the test means is that the data observed are unlikely to have been generated from a population in which there is zero variation at the country level. Of course, it is hard to imagine a real world situation in which there would actually be zero variation at the country level, so this is your classical straw-man null hypothesis significance test. Basically it tells you that your sample size is large enough and your measurements precise enough that you can verify what you already must have known, namely, that there is some variation at the country level, that the ICC is not exactly zero. In fact, it is probably telling you that your sample size is so large that statistical significance has no connection at all to meaningfulness.

    But, the icc is very low, which should mean that the country level does not explain a significant part of the variation of the dependent variable.
    It means that of the variance not explained by the fixed effects in the model, about 3.6% of that is occurring at the country level. Whether you want to consider that "significant" (a word I dislike here because it is unclear whether you mean statistically significant or practically significant and the two are quite different concepts) depends on your context.
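    To see where that "about 3.6%" comes from: on the latent-variable scale that estat icc uses for a logistic model, the level-1 residual variance is fixed at π²/3, so ICC = σ²_u / (σ²_u + π²/3). A small Python sketch, back-solving the country-level variance component implied by the reported ICC (the 0.0355061 is the figure from the output above; the implied variance is a derived quantity, not something printed by Stata):

    ```python
    import math

    # Latent-scale ICC for a two-level logistic model:
    #   ICC = sigma2_u / (sigma2_u + pi^2 / 3)
    # where pi^2/3 (~3.29) is the variance of the standard logistic distribution.
    RESID_VAR = math.pi ** 2 / 3

    icc = 0.0355061  # value reported by estat icc in the post above

    # Back out the country-level variance component implied by that ICC.
    sigma2_u = icc * RESID_VAR / (1 - icc)
    sigma_u = math.sqrt(sigma2_u)  # country-level SD on the logit scale

    print(f"implied country-level variance: {sigma2_u:.4f}")  # ~0.1211
    print(f"implied country-level SD (logit scale): {sigma_u:.4f}")  # ~0.3480
    ```

    That SD of roughly 0.35 logits is the quantity the thought experiment at the end of this post works with.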

    If you want to study how well your model fits the data, rather than doing single summary statistics, you should actually look at the fit of the data. Do a scatterplot of the predicted values against the observed values. Partition the data into bins defined by some percentiles of predicted outcome probability and then compare the number of predicted and observed positive outcomes in each bin. (This is like a Hosmer-Lemeshow statistic, but don't bother with the chi square because in a sample this large it will inevitably be statistically significant, and because in a multi-level model the distribution may well not be approximately chi square anyway.)
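    The binned comparison can be sketched outside Stata as well. Here is a minimal Python version on simulated data (every name and number here is made up for illustration; in Stata the predicted probabilities would come from predict after melogit):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated stand-ins for the real quantities: predicted probabilities
    # from the model and the observed 0/1 outcome. The outcomes are drawn
    # from the predictions, so this toy data is well calibrated by design.
    p_hat = rng.uniform(0.01, 0.99, size=10_000)
    y = rng.binomial(1, p_hat)

    # Partition observations into 10 bins by deciles of predicted probability,
    # then compare predicted vs. observed counts of positives within each bin.
    edges = np.quantile(p_hat, np.linspace(0, 1, 11))
    bins = np.digitize(p_hat, edges[1:-1])  # bin index 0..9 per observation

    for b in range(10):
        mask = bins == b
        expected = p_hat[mask].sum()   # model-predicted count of positives
        observed = y[mask].sum()       # actual count of positives
        print(f"bin {b}: n={mask.sum():5d}  expected={expected:7.1f}  observed={observed:5d}")
    ```

    In a well-calibrated model the expected and observed counts track each other across all bins, including the extremes.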

    Comparing different models depends on what you want the models to do for you. If your primary goal is to use the models to predictively discriminate positive and negative outcomes, then calculate the area under the ROC curve for the models and pick the highest. If you want a model that gives the most accurate predicted probabilities, then do the Hosmer-Lemeshow like process discussed above and pick the one you like best based on that. If you want the model that best identifies the association of some particular predictor to the outcome, choose the one that gives the smallest standard error for that coefficient. Better and best are only definable with respect to particular research goals and you have to use the approach that is related to your particular goals.
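    For the discrimination route, the area under the ROC curve has a useful interpretation: it is the probability that a randomly chosen positive outcome receives a higher predicted probability than a randomly chosen negative one (the Mann-Whitney identity). A hedged Python sketch of that computation, with a tiny worked example (function name and data are mine; in Stata, roctab can report the same area from the outcome and the predicted probabilities):

    ```python
    import numpy as np

    def auc(y, p_hat):
        """AUC via the Mann-Whitney identity: the probability that a randomly
        chosen positive outcome gets a higher predicted probability than a
        randomly chosen negative one, counting ties as 1/2."""
        y = np.asarray(y)
        p_hat = np.asarray(p_hat)
        pos = p_hat[y == 1]
        neg = p_hat[y == 0]
        greater = (pos[:, None] > neg[None, :]).sum()
        ties = (pos[:, None] == neg[None, :]).sum()
        return (greater + 0.5 * ties) / (pos.size * neg.size)

    # Tiny worked example: of the four positive/negative pairs, three are
    # ordered correctly (0.35 > 0.1, 0.8 > 0.1, 0.8 > 0.4), so AUC = 3/4.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
    ```

    The pairwise formulation above is O(n_pos × n_neg), which is fine for a sanity check but slow at 60,000 observations; a rank-based version scales better.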

    If you are particularly concerned about whether or not including country-level variation is useful, here is an approach I often use. Look at the actual variance component at the country level, and take the square root so you have a standard deviation. Now consider an "average" observation. Suppose that you changed that "average" observation so that its country-level intercept were, say, 3 standard deviations out (in either direction). How much would that change the predicted probability? Is the effect of 3 standard deviations in the country-level intercept just a drop in the bucket compared to the other effects in the model? Or would it be appreciable? Then ask a similar question about a "typical" observation that has a low predicted probability, and one that has a high predicted probability. If throwing in 3 sd's worth of country-level variation makes only a negligible difference in the predicted probability no matter what, then the country-level variation is just a little flourish that can be omitted with no material consequences. But if it would make a meaningful change in some outcome probabilities, then it is an important part of your model (assuming that the actual predicted probabilities are important to your research goals).
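    This thought experiment is easy to run numerically. Using a country-level SD of about 0.35 on the logit scale (the value implied by the reported ICC, back-solved earlier; not a number from the Stata output), a 3-SD shift moves the linear predictor by about ±1.04 logits:

    ```python
    import math

    def invlogit(xb):
        """Inverse logit: map a linear predictor to a probability."""
        return 1 / (1 + math.exp(-xb))

    def logit(p):
        """Logit: map a probability to the linear-predictor scale."""
        return math.log(p / (1 - p))

    sigma_u = 0.348      # country-level SD on the logit scale implied by ICC = 0.0355
    shift = 3 * sigma_u  # ~1.04 logits

    # Low, middling, and high baseline predicted probabilities, as suggested above.
    for p0 in (0.05, 0.50, 0.95):
        xb = logit(p0)
        lo, hi = invlogit(xb - shift), invlogit(xb + shift)
        print(f"baseline {p0:.2f}: 3 SD down -> {lo:.3f}, 3 SD up -> {hi:.3f}")
    ```

    Even with this "very low" ICC, a baseline probability of 0.50 moves to roughly 0.26 or 0.74 under a 3-SD intercept shift, and 0.05 moves to roughly 0.02 or 0.13, which is the kind of change one then has to judge against the research goals.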



    • #3
      Thank you Clyde for that thorough answer! I will try to follow your instructions.
