Hi Forum Users,
I frequently use interaction terms in regress and feel comfortable interpreting them. My understanding is that for a simple multigroup interaction (e.g., i.iv##c.iv), each interaction coefficient is added to the linear effect for the omitted group to give that group's slope.
I am modelling four groups and their interactions with a 4-point measure of belongingness (continuous in the output below). I am predicting a binary outcome (dv).
Code:
. tab group dv

           | RECODE of pdis_05l
           | (Discrimination -
           | Physical/mental
           | disability - 2 yrs
           | before C
     group |        No        Yes |     Total
-----------+----------------------+----------
      Base |     3,059      2,057 |     5,116
        G2 |    12,196        410 |    12,606
        G3 |       311         23 |       334
        G4 |    17,594         43 |    17,637
-----------+----------------------+----------
     Total |    33,160      2,533 |    35,693
Here is the logistic output:
Code:
. logit dv b1.group##c.continuous

Iteration 0:   log likelihood = -9061.6711
Iteration 1:   log likelihood = -6288.6974
Iteration 2:   log likelihood = -5603.9077
Iteration 3:   log likelihood = -5479.1011
Iteration 4:   log likelihood = -5462.7962
Iteration 5:   log likelihood = -5461.9443
Iteration 6:   log likelihood = -5461.9426
Iteration 7:   log likelihood = -5461.9426

Logistic regression                             Number of obs     =     35,230
                                                LR chi2(7)        =    7199.46
                                                Prob > chi2       =     0.0000
Log likelihood = -5461.9426                     Pseudo R2         =     0.3972

------------------------------------------------------------------------------------
                dv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
             group |
               G2  |  -3.358668   .1726963   -19.45   0.000    -3.697147    -3.02019
               G3  |  -2.748029   .7514086    -3.66   0.000    -4.220762   -1.275295
               G4  |  -4.196719   .4204416    -9.98   0.000     -5.02077   -3.372669
                   |
        continuous |   -.406358   .0301248   -13.49   0.000    -.4654015   -.3473145
                   |
group#c.continuous |
               G2  |   .1501638   .0620267     2.42   0.015     .0285937    .2717338
               G3  |   .2074039    .245937     0.84   0.399    -.2746237    .6894315
               G4  |  -.5136521   .1687538    -3.04   0.002    -.8444035   -.1829008
                   |
             _cons |   .6700306   .0827124     8.10   0.000     .5079173     .832144
------------------------------------------------------------------------------------
My naive interpretation of this is that:
For G1 (the omitted group, labelled Base in the table above), continuous negatively predicts the outcome (slope = -0.41).
For G2, continuous negatively predicts the outcome (slope = -0.406 + 0.150 = -0.26).
For G3, continuous negatively predicts the outcome (slope = -0.406 + 0.207 = -0.20).
For G4, continuous negatively predicts the outcome (slope = -0.406 + (-0.514) = -0.92).
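The sums above can also be obtained with standard errors instead of by hand. A minimal sketch, to be run right after the logit command, assuming group is stored with numeric levels 1 to 4 (G2 = 2, G3 = 3, G4 = 4); the actual codes behind the value labels are not shown in the output, so the level numbers may need adjusting:

```stata
* Group-specific slopes of continuous on the scale of the coefficient table
* (log odds). Assumes the logit model above is the active estimation and
* that group is coded 1-4 (an assumption; check with: label list).

* G1 (base group)
lincom continuous
* G2
lincom continuous + 2.group#c.continuous
* G3
lincom continuous + 3.group#c.continuous
* G4
lincom continuous + 4.group#c.continuous
```

Because these are linear combinations of the estimated coefficients, they stay on the same scale as the coefficient table rather than the scale of the graphs.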
Out of these four, G4 has the most negative slope. But when graphed, G1's relationship is easily the most negative and all other relationships are functionally flat. Moreover, when I re-run the same model with simple OLS (as a check on the original model), the results for G4 look quite different: the G4 interaction coefficient is positive, and G4's slope is essentially zero (-0.096 + 0.093 = -0.003).
Code:
. regress dv b1.group##c.continuous

      Source |       SS           df       MS      Number of obs   =    35,230
-------------+----------------------------------   F(7, 35222)     =   2187.57
       Model |  707.685172         7  101.097882   Prob > F        =    0.0000
    Residual |  1627.77395    35,222  .046214694   R-squared       =    0.3030
-------------+----------------------------------   Adj R-squared   =    0.3029
       Total |  2335.45913    35,229  .066293654   Root MSE        =    .21498

------------------------------------------------------------------------------------
                dv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
             group |
               G2  |   -.601091   .0107249   -56.05   0.000    -.6221121   -.5800699
               G3  |  -.5544746   .0435127   -12.74   0.000    -.6397608   -.4691884
               G4  |  -.6469505   .0106044   -61.01   0.000    -.6677355   -.6261655
                   |
        continuous |  -.0959357   .0030744   -31.20   0.000    -.1019617   -.0899097
                   |
group#c.continuous |
               G2  |   .0875945    .003739    23.43   0.000      .080266     .094923
               G3  |   .0834199   .0137352     6.07   0.000     .0564985    .1103413
               G4  |   .0933286   .0036355    25.67   0.000     .0862029    .1004544
                   |
             _cons |   .6573691   .0086515    75.98   0.000     .6404118    .6743263
------------------------------------------------------------------------------------
To be clear, I am not expecting equivalent results from regress and logit, but the difference in slopes between the two is quite surprising. Running margins and marginsplot after either model produces similar-looking graphs (again, G1 is easily the most negative slope and G4 is flat), which is consistent with the OLS coefficients, but the logit coefficients are counterintuitive. I am not sure whether the models disagree with each other or are answering two technically different questions.
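The margins calls behind these graphs are not shown above. A minimal sketch of commands that would produce plots of this kind, assuming continuous takes the values 1 to 4 (the coding of the four-point scale is not stated) and using hypothetical graph names:

```stata
* Predictions by group across the scale, after each model.
* The at() range 1(1)4 and the graph names are assumptions for illustration.

quietly logit dv b1.group##c.continuous
* default prediction after logit is Pr(dv)
margins group, at(continuous=(1(1)4))
marginsplot, name(logit_pr, replace)
* same plot on the linear-predictor (log-odds) scale used by the coefficient table
margins group, at(continuous=(1(1)4)) predict(xb)
marginsplot, name(logit_xb, replace)

quietly regress dv b1.group##c.continuous
margins group, at(continuous=(1(1)4))
marginsplot, name(ols_fit, replace)
```

Drawing the logit predictions both as probabilities and with predict(xb) allows the graph to be compared with the coefficient table on a like-for-like scale.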
Continuous is on a four-point scale, and people in G4 were very unlikely to say yes to the outcome at all (43 of 17,637). I was expecting larger standard errors for that group, but not a complete flipping of the linear effect.
Can someone explain what is going on?
Thanks!
David.