In a recent post (http://www.statalist.org/forums/foru...coeff-of-logit), Richard Williams provided a link to P. Allison's answer to the question "When can you safely ignore multicollinearity?" (http://statisticalhorizons.com/multicollinearity). However, I have the following concern about this answer.
My basic model (excluding control variables) is as follows:
y = c0 + c1 · xm + c2 · xx + c3 · mm, where y is the dependent variable, and xm, xx and mm are dummy variables (I use OLS).
I would also like to analyze the effect of the interaction terms between another continuous and centered variable (zs) and xm, xx and mm. The resulting model is:
(1) y = c0 + c1 · xm + c2 · xx + c3 · mm + c4 · zs + c5 · xm · zs + c6 · xx · zs + c7 · mm · zs.
In this case, the command -collin xm xx mm zs xm·zs xx·zs mm·zs- indicates that zs has a VIF of 15.01. Although this value is above the commonly cited threshold of 10, P. Allison's answer suggests that it would be safe to ignore the potential multicollinearity problem in this case.
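To put that number in context, here is the standard definition of the VIF, where R_j^2 is the R-squared from regressing predictor j on all the other predictors. The arithmetic below only inverts that definition for the reported value of 15.01; it is not additional output from collin.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\quad\Longrightarrow\quad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j}

% zs in model (1):
R_{zs}^2 = 1 - \frac{1}{15.01} \approx 0.933
```

In other words, about 93% of the variance of zs in (1) is explained by the dummies and the three interaction terms.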
However, I could perform my analysis in a different way. Specifically, I could estimate a separate regression equation for each interaction effect. For instance, for the interaction effect corresponding to xm, the model would now be:
(2) y = c0 + c1 · xm + c2 · xx + c3 · mm + c4 · zs + c5 · xm · zs
In this case, although I am still using an interaction term, collin indicates that no variable has a VIF above 10. Indeed, the VIF associated with zs is 3.37.
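To illustrate the comparison between (1) and (2), here is a minimal Python sketch on simulated data (not my actual data). It assumes, purely for illustration, that xm, xx and mm are mutually exclusive dummies with a small reference group, and that zs is independent of them; the group shares, the sample size and the helper function vif are hypothetical. The VIF is computed directly from its definition via auxiliary regressions, not with collin.

```python
import numpy as np

def vif(X):
    """VIF of each column of X, from an auxiliary OLS regression
    of that column on an intercept plus all remaining columns."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        centered = target - target.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(0)
n = 5000  # hypothetical sample size

# Assumption: four mutually exclusive groups, 0 being a small reference group.
group = rng.choice(4, size=n, p=[0.1, 0.3, 0.3, 0.3])
xm = (group == 1).astype(float)
xx = (group == 2).astype(float)
mm = (group == 3).astype(float)

# Continuous, centered moderator, simulated independently of the dummies.
zs = rng.normal(size=n)
zs -= zs.mean()

# Model (1): all three interactions. Model (2): only the xm interaction.
X1 = np.column_stack([xm, xx, mm, zs, xm * zs, xx * zs, mm * zs])
X2 = np.column_stack([xm, xx, mm, zs, xm * zs])

vif_zs_model1 = vif(X1)[3]  # column 3 is zs in both design matrices
vif_zs_model2 = vif(X2)[3]
print(f"VIF of zs in (1): {vif_zs_model1:.2f}")
print(f"VIF of zs in (2): {vif_zs_model2:.2f}")
```

Under these assumptions the VIF of zs is much larger in (1) than in (2), which is the same pattern I see in my data, although the simulated values need not match 15.01 and 3.37.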
My question is whether I should stick to (1), or whether the VIF of zs in (1) is a good enough reason to use (2) instead.
Thanks in advance.