Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Predicted probabilities after multilevel multinomial logistic regression with random effects

    I am trying to get predicted probabilities of a 7-category level-1 variable after running a multinomial logistic regression model with a random effect for the level 2 variable. There are 47,142 observations in the data at level 1, and 175 level 2 clusters. I tried this in a couple of different ways, using Stata 15.1. One was running multilevel logistic regressions with 7 different binary variables created based off the categorical variable, including a random effect at level 2, and then calculating probability based on the intercept (below is one example, where group1=1 if group=1, and 0 otherwise):
    Code:
    melogit group1 || lev2id
    di exp(_b[_cons])/(1+exp(_b[_cons]))
    However, doing this results in probabilities that do not sum to 1. When I do this with two-level categorical variables, the probabilities do sum to 1. So I suspect this has something to do with not adequately accounting for the other categories, and the fact that the random effect for level 2 varies substantially across these models. The other way I tried to do it was using gsem:
    Code:
    gsem (i.group<-M[lev2id]@1), mlogit
    margins
    This does get probabilities that sum to 1; however, the probabilities change depending on the reference group or base outcome. The predicted probability that I get for the reference group matches the predicted probability that I get using the above method (melogit) for that group.

    My question is, what is the correct way (if any) to get predicted probabilities after a multilevel multinomial model, that sum to 1 and are consistent regardless of reference group?

    Sorry if any of this is unclear or if I have missed providing any information.

  • #2
    First, I noticed a typo. In the above post, it should be:
    Code:
    melogit group1 || lev2id:
    For anyone else who's curious or who runs into this issue, I was told by a reputable source that I wasn't adequately accounting for the random effect. I was told you add the variance of the random effect to the variance of the standard logistic distribution (pi^2/3), divide the intercept by the square root of that number to get a z-score, and then get the proportion above that z-score on the standard normal distribution:
    Code:
    melogit group1 || lev2id:
    di normal(_b[_cons]/sqrt(_b[/var(_cons[lev2id])]+(_pi^2/3)))
    This seems to be working! I at least get probabilities that sum to 1, and look similar enough to the univariate proportions (e.g., from -tab group1-).

    Comment

    Working...
    X