Dear Statalists,
I am having difficulty understanding interaction effects between categorical variables. I have read many posts here and web pages on the issue, including Richard Williams' posts and slides, but I am still puzzled.
I am testing an interaction effect with logistic regression. After the regression, I ran testparm and margins, dydx. The two sets of results do not point in the same direction regarding whether the interaction effect I am testing is statistically significant. I am inclined to report the margins results in my paper and want to make sure that my understanding is correct.
Logistic regression and testparm results:
Code:
. logit outcome ib1.var1##ib1.var2 ///
>       ib5.var3 ib5.var4 c.var5 ib5.var6 ib0.var7 ///
>       [pw=xw]

Iteration 0:   log pseudolikelihood = -5438.9395
Iteration 1:   log pseudolikelihood =  -4806.458
Iteration 2:   log pseudolikelihood = -4758.1175
Iteration 3:   log pseudolikelihood = -4757.7004
Iteration 4:   log pseudolikelihood = -4757.7004

Logistic regression                             Number of obs   =       4535
                                                Wald chi2(22)   =     329.65
                                                Prob > chi2     =     0.0000
Log pseudolikelihood = -4757.7004               Pseudo R2       =     0.1253

------------------------------------------------------------------------------
             |               Robust
     outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      0.var1 |  -.5820021   .1613879    -3.61   0.000    -.8983167   -.2656876
             |
        var2 |
          3  |  -.4572081   .4521419    -1.01   0.312     -1.34339    .4289737
          4  |  -.6190117   .4763957    -1.30   0.194     -1.55273    .3147067
          7  |  -.3674928    .253224    -1.45   0.147    -.8638028    .1288172
             |
   var1#var2 |
        0 3  |   .8116348   .5281965     1.54   0.124    -.2236114    1.846881
        0 4  |   .4436761   .5312465     0.84   0.404    -.5975478      1.4849
        0 7  |   .2697532   .2712406     0.99   0.320    -.2618687    .8013751
             |
        var3 |
          1  |   .2479472   .1651902     1.50   0.133    -.0758196     .571714
          2  |   .0922704   .1805643     0.51   0.609    -.2616291    .4461699
          3  |  -.1055011   .2245428    -0.47   0.638     -.545597    .3345947
          4  |  -.3489315   .2234298    -1.56   0.118    -.7868459     .088983
             |
        var4 |
          1  |   1.663486   .1979195     8.40   0.000     1.275571    2.051401
          2  |   1.272541   .1850998     6.87   0.000     .9097521     1.63533
          3  |   1.104931   .1806004     6.12   0.000     .7509613    1.458902
          4  |   .7161381    .172235     4.16   0.000     .3785638    1.053712
             |
        var5 |   .2392388   .1058836     2.26   0.024     .0317107    .4467668
             |
        var6 |
          0  |  -.5285216   .2047854    -2.58   0.010    -.9298935   -.1271497
          1  |  -.7029717   .2321125    -3.03   0.002    -1.157904   -.2480396
          2  |  -.1592464    .217776    -0.73   0.465    -.5860795    .2675866
          3  |  -.3754231     .20313    -1.85   0.065    -.7735507    .0227044
          4  |    .021854   .2181081     0.10   0.920    -.4056299     .449338
             |
      1.var7 |     .09903   .1801096     0.55   0.582    -.2539784    .4520384
       _cons |  -.9562587   .9037396    -1.06   0.290    -2.727556    .8150385
------------------------------------------------------------------------------

. testparm i.var1#i.var2

 ( 1)  [outcome]0.var1#3.var2 = 0
 ( 2)  [outcome]0.var1#4.var2 = 0
 ( 3)  [outcome]0.var1#7.var2 = 0

           chi2(  3) =    3.20
         Prob > chi2 =    0.3616

. 
end of do-file
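As a side note on my own understanding (please correct me if I am wrong): I believe the same joint Wald test of the interaction could also be obtained with contrast after the logit, and it should reproduce the testparm chi2(3) above. A minimal sketch, assuming contrast behaves the same way after logit with pweights:

Code:
* Sketch of my assumption: joint Wald test of the interaction terms on the
* log-odds scale, which I expect to match testparm i.var1#i.var2 above
contrast var1#var2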
Margins with dydx and post-estimation Wald tests:
Code:
. margins , dydx(var1) over(var2) post

Average marginal effects                        Number of obs   =       4535
Model VCE    : Robust

Expression   : Pr(outcome), predict()
dy/dx w.r.t. : 0.var1
over         : var2

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
0.var1       |
        var2 |
          1  |  -.0790368   .0202814    -3.90   0.000    -.1187876    -.039286
          3  |   .0318166   .0714287     0.45   0.656     -.108181    .1718143
          4  |  -.0250947   .0896887    -0.28   0.780    -.2008812    .1506919
          7  |  -.0515102   .0346325    -1.49   0.137    -.1193886    .0163681
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

. test [0.var1]1.var2 = [0.var1]7.var2

 ( 1)  [0.var1]1bn.var2 - [0.var1]7.var2 = 0

           chi2(  1) =    0.48
         Prob > chi2 =    0.4897

. test [0.var1]1.var2 = [0.var1]4.var2

 ( 1)  [0.var1]1bn.var2 - [0.var1]4.var2 = 0

           chi2(  1) =    0.34
         Prob > chi2 =    0.5594

. test [0.var1]1.var2 = [0.var1]3.var2

 ( 1)  [0.var1]1bn.var2 - [0.var1]3.var2 = 0

           chi2(  1) =    2.23
         Prob > chi2 =    0.1351

. 
The regression and testparm results show that the interaction effects are not significant: the effect of var1 does not differ across levels of var2.
But the marginal effects seem to tell a different story. According to margins, the effect of var1 is significant when var2 = 1 and non-significant when var2 is larger than 1. However, the pairwise differences between these marginal effects (the test results above) do not reach statistical significance.
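In case it is relevant, here is how I understand the pairwise comparisons could be obtained in one step instead of the separate test commands. This is only a sketch of my assumption that pwcompare can be combined with over() and dydx() in margins:

Code:
* Sketch of my assumption: pairwise contrasts of the marginal effect of var1
* across the levels of var2, which I expect to match the separate test results
margins, dydx(var1) over(var2) pwcompare(effects)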
I would like to understand why the results seem to point in different directions and how I should interpret the findings.
Many thanks.
Regards,
JiYuan.