Hi all:
I've put together a multilevel (binomial) logistic model using 'meqrlogit'. The model appears to have a very good fit, and the coefficient standard errors are all more than acceptably small for the purposes of the model (and all p-values are <0.001).
However, the model includes an interaction term and this seems to be causing multicollinearity. Of note, this is not because the standard errors grew large when I added the interaction term (they remained small), but rather because of some preliminary VIFs I calculated (which were ~40 for the interaction term and one of the predictors).
Theory suggests the model would not be correctly specified if the interaction term were dropped.
I have two related questions:
1) How do I test for multicollinearity when using meqrlogit?
Searching this forum, I see that multicollinearity is a property of the data (here), meaning I can just run a series of single level linear regressions (using 'regress') and calculate the VIFs for each using R^2.
Is this correct if I'm using multilevel logistic regression (meqrlogit)? Or do I need to do an equivalent single-level test using logistic regression?
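To make the calculation concrete, here is a minimal sketch of the kind of thing I have in mind. The variable names (y, x1, x2, x1x2) are placeholders, not my actual variables. Each VIF is 1/(1 - R^2) from regressing one predictor on the others, so a VIF of ~40 corresponds to an R^2 of about 0.975.

```stata
* Placeholder names: x1 and x2 are two predictors, x1x2 is their product
* (the interaction term), and y is the binary outcome.
generate x1x2 = x1 * x2

* Auxiliary regression for x1 on the remaining predictors
regress x1 x2 x1x2
display "VIF for x1   = " 1/(1 - e(r2))

* Auxiliary regression for the interaction term
regress x1x2 x1 x2
display "VIF for x1x2 = " 1/(1 - e(r2))

* Cross-check: estat vif after a single-level linear regression.
* The VIFs depend only on the predictors, not on the outcome.
regress y x1 x2 x1x2
estat vif
```

This ignores the multilevel structure and the logit link entirely, which is exactly what I'm unsure about.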
2) Do I have a multicollinearity problem?
Despite the high VIFs (for now assuming I've calculated these correctly), and based on another previous discussion on this forum (here), my understanding is that I don't have a problem.
As I understand it, multicollinearity may affect individual coefficients in the model, but I am not trying to assess the separate effects of each variable in the model.
Rather, the purpose of the model is to make predictions: as I understand it, multicollinearity does not significantly affect model predictions. Is that correct?
If this is correct, I'll need to chase down a reference supporting it. I've found a few online blogs/discussions on multicollinearity and prediction, but I have not come across a textbook or paper. If anyone knows of any, please do let me know.
Thanks.