I am running a GLM with a Gamma family and log link because my dependent variable is strictly positive, continuous, and heavily right-skewed.
My independent variables are all categorical (binary):
X1 (binary): main effect of interest
X2 (binary): hypothesized moderator of X1
X3 (binary): hypothesized moderator of X1
My model is:
glm Y i.X1##i.X2 i.X1##i.X3, family(gamma) link(log) vce(robust)
Since Stata does not support estat vif after glm, I fit an OLS model with the same right-hand side and found that X3 and its interaction term i.X1#i.X3 both have VIF > 10, suggesting multicollinearity. Further inspection revealed that X3 is heavily imbalanced across the categories of X1 (0.6% vs 15.2%).
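For concreteness, the diagnostic steps were roughly as follows. This is a sketch using the variable names above; the exact tabulate options are an assumption, and the linear model is fit only to obtain VIFs, not for inference.

```stata
* Cross-tabulate X1 and X3 with row percentages to show how X3 is
* distributed within each category of X1
tabulate X1 X3, row

* Linear model with the same right-hand side as the glm,
* fit only so that VIFs can be computed
regress Y i.X1##i.X2 i.X1##i.X3

* Variance inflation factor for each term in the model
estat vif
```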
My questions are:
- Given that all my independent variables are binary, is VIF still a valid diagnostic for multicollinearity, or should I use GVIF instead?
- In the context of GLM (Gamma family), is multicollinearity a concern that requires correction?
- If correction is needed, what is the most appropriate approach given that:
  - My theoretical hypothesis requires the X1 × X3 interaction term
