Hi all
I am working on data from a randomized controlled experiment where subjects were randomly assigned to one of four groups and given a shopping task in a virtual online grocery store. I'll denote the groups as: Control, WL, T, WLT. The outcome I am interested in is the number of products containing red meat in the shopping basket at checkout. In the present analysis, I am examining interactions between a number of subject characteristics and the effect of each treatment (e.g., WL vs. Control). I am only interested in associations, so I fit separate Poisson regressions with an indicator variable for the study arm, the moderator of interest (e.g., gender), and their interaction. To see if the treatment effects differ by these subject characteristics, I compute these effects separately at each level of the moderator and do a homogeneity test using contrast.
I am having trouble understanding the results for two of the moderators of interest, namely summary measures of a subject's interest in health and interest in sustainability. These measures are simple averages of items measured on a 1-5 Likert scale (two items for the health measure, five for the sustainability measure). For now I am treating these as continuous variables because they take non-integer values. I follow the procedure above conditioning on values 1, 2, 3, 4, and 5. In each case I get strong rejections, but when I compute pairwise comparisons, none is statistically significant. OLS finds no interaction and the point estimates and standard errors for the conditional treatment effects are similar to the Poisson ones. Similarly, when I treat the moderators as categorical variables (alternately rounding up and down), contrast's results agree with OLS. Interestingly, contrast's output shows that the degrees of freedom for the problematic case (Poisson with continuous moderator) are not equal to 4 (the number of restrictions I am testing), but 3 (one restriction gets dropped).
Code and output for interest in health are below. Diagnostic plots are not shown for conciseness (slight nonnormality of the Anscombe residuals in the tails and no apparent relationships in the residual vs. fitted plots). Code and output for the Poisson regressions with a categorical moderator are in #2 (character limit).
The P values that I think should roughly agree, yet do not, are in red. In OLS these are just the P values for each interaction term because of linearity: if an interaction term is significant, the difference in treatment effects at int_health = x and int_health = y will be significant too, I think.
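To spell out the linearity argument, here is a sketch in my own notation (the symbols are labels I am introducing, not Stata output): h is int_health, k indexes the arm, and the Greek letters stand for the intercept, the arm coefficient, the int_health coefficient, and the arm#c.int_health coefficient. It assumes margins reports the difference in predicted means between arm k and Control.

```latex
% Assumed notation: \mu_k(h) = conditional mean of redmeat in arm k at int_health = h,
% \alpha = _cons, \gamma_k = arm coefficient, \delta = int_health coefficient,
% \theta_k = arm#c.int_health coefficient, \tau_k(h) = effect of arm k vs. Control at h.
\begin{align*}
\text{OLS:}\quad
  & \mu_k(h) = \alpha + \gamma_k + \delta h + \theta_k h, \\
  & \tau_k(h) = \mu_k(h) - \mu_{\text{Control}}(h) = \gamma_k + \theta_k h, \\
  & \tau_k(x) - \tau_k(y) = \theta_k\,(x - y). \\[6pt]
\text{Poisson (log link):}\quad
  & \mu_k(h) = \exp(\alpha + \gamma_k + \delta h + \theta_k h), \\
  & \tau_k(h) = \exp(\alpha + \delta h)\,\bigl[\exp(\gamma_k + \theta_k h) - 1\bigr].
\end{align*}
```

In the OLS case the difference between two conditional effects is a fixed multiple of the interaction coefficient, so its test coincides with the test of that coefficient. In the Poisson case the effect on the predicted-mean scale is nonlinear in h and the difference does not reduce to a multiple of the interaction coefficient, so I am less sure the same correspondence has to hold there.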
Thanks for any help anyone can provide (and apologies for the long post but more information is better than less).
Maxime
Preliminaries:
Code:
. summarize redmeat, detail

              Number of red meat items purchased
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%            1              0       Obs               3,518
25%            2              0       Sum of Wgt.       3,518

50%            3                      Mean           3.144116
                        Largest       Std. Dev.      1.717456
75%            4              9
90%            5              9       Variance       2.949654
95%            6             10       Skewness       .0920231
99%            7             12       Kurtosis       2.792986

. histogram redmeat, discrete percent
(start=0, width=1)

. tabstat redmeat, stats(variance mean) by(arm) format(%4.2f) nototal

Summary for variables: redmeat
     by categories of: arm (Arm)

     arm |  variance      mean
---------+--------------------
 Control |      2.84      3.50
      WL |      3.22      3.24
       T |      2.71      3.07
     WLT |      2.73      2.76
------------------------------

.
Poisson, int_health continuous:
Code:
. glm redmeat arm##c.int_health, family(poisson) vce(robust)

Iteration 0:   log pseudolikelihood = -6773.6041
Iteration 1:   log pseudolikelihood = -6765.4898
Iteration 2:   log pseudolikelihood = -6765.4867
Iteration 3:   log pseudolikelihood = -6765.4867

Generalized linear models                         Number of obs   =      3,490
Optimization     : ML                             Residual df     =      3,482
                                                  Scale parameter =          1
Deviance         =  3814.697748                   (1/df) Deviance =   1.095548
Pearson          =  3146.128499                   (1/df) Pearson  =   .9035406

Variance function: V(u) = u                       [Poisson]
Link function    : g(u) = ln(u)                   [Log]

                                                  AIC             =   3.881654
Log pseudolikelihood = -6765.486681               BIC             =  -24590.26

----------------------------------------------------------------------------------
                 |               Robust
         redmeat |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
             arm |
              WL |    .075889    .089114     0.85   0.394    -.0987712    .2505493
               T |  -.1667416   .0964557    -1.73   0.084    -.3557913    .0223081
             WLT |  -.1822114    .103245    -1.76   0.078    -.3845679    .0201451
                 |
      int_health |     -.0812   .0167178    -4.86   0.000    -.1139663   -.0484337
                 |
arm#c.int_health |
              WL |  -.0439101   .0241868    -1.82   0.069    -.0913154    .0034952
               T |    .008172   .0261055     0.31   0.754    -.0429938    .0593378
             WLT |  -.0174026   .0279498    -0.62   0.534    -.0721832     .037378
                 |
           _cons |   1.562273   .0624578    25.01   0.000     1.439858    1.684688
----------------------------------------------------------------------------------

.
. // Diagnostics
.
. gof // Program for deviance and Pearson goodness-of-fit tests as per estat gof after poisson

Goodness-of-fit test statistic     Prob > chi2(3482)
         Deviance =  3814.698                 0.0001
          Pearson =  3146.128                 1.0000

. predict anscombe, anscombe
(28 missing values generated)

. predict deviance, deviance
(28 missing values generated)

. predict pearson, pearson
(28 missing values generated)

. summarize anscombe deviance pearson

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    anscombe |      3,490   -.1254524    1.065312   -3.14559   4.024217
    deviance |      3,490   -.1144572    1.039349  -2.965691   3.980549
     pearson |      3,490   -.0000243    .9495935   -2.09706   5.334105

. qnorm anscombe, msize(small) mfcolor(navy%50) mlwidth(thin)

. predict fitted
(option mu assumed; predicted mean redmeat)
(28 missing values generated)

. foreach r in anscombe deviance pearson {
  2.     lowess `r' fitted, msize(small) mfcolor(navy%50) mlwidth(thin) lineopts(lcolor(midblue)) name(`r'_lowess, replace)
  3. }

. drop fitted anscombe deviance pearson

.
. // Moderation tests
.
. testparm arm#c.int_health

 ( 1)  [redmeat]2.arm#c.int_health = 0
 ( 2)  [redmeat]3.arm#c.int_health = 0
 ( 3)  [redmeat]4.arm#c.int_health = 0

           chi2(  3) =    4.88
         Prob > chi2 =    0.1812

. margins, dydx(arm) at(int_health = (1(1)5)) post vsquish

Conditional marginal effects                    Number of obs     =      3,490
Model VCE    : Robust

Expression   : Predicted mean redmeat, predict()
dy/dx w.r.t. : 2.arm 3.arm 4.arm
1._at        : int_health      =           1
2._at        : int_health      =           2
3._at        : int_health      =           3
4._at        : int_health      =           4
5._at        : int_health      =           5

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.arm        |  (base outcome)
-------------+----------------------------------------------------------------
2.arm        |
         _at |
          1  |   .1429054   .2956879     0.48   0.629    -.4366322    .7224431
          2  |  -.0480895   .1792539    -0.27   0.788    -.3994208    .3032417
          3  |  -.2030383   .0995889    -2.04   0.041    -.3982291   -.0078476
          4  |  -.3272395   .0859704    -3.81   0.000    -.4957384   -.1587405
          5  |  -.4252858   .1252607    -3.40   0.001    -.6707922   -.1797794
-------------+----------------------------------------------------------------
3.arm        |
         _at |
          1  |  -.6448573   .2885458    -2.23   0.025    -1.210397   -.0793178
          2  |  -.5661724   .1783523    -3.17   0.002    -.9157364   -.2166084
          3  |  -.4956238   .0981094    -5.05   0.000    -.6879146   -.3033329
          4  |  -.4324359    .084127    -5.14   0.000    -.5973219     -.26755
          5  |  -.3759039   .1304803    -2.88   0.004    -.6316406   -.1201673
-------------+----------------------------------------------------------------
4.arm        |
         _at |
          1  |  -.7957709   .2993894    -2.66   0.008    -1.382563   -.2089784
          2  |   -.791002   .1826672    -4.33   0.000    -1.149023   -.4329808
          3  |  -.7812252    .099314    -7.87   0.000    -.9758771   -.5865733
          4  |  -.7673363   .0840066    -9.13   0.000    -.9319863   -.6026863
          5  |  -.7501138   .1282987    -5.85   0.000    -1.001575    -.498653
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

. marginsplot, ylabel(-2(1)2) xlabel(.) xtitle("Interest in health (lowest to highest)") legend(order(4 "WL" 5 "T" 6 "WLT") row(1)) name(healthPOIS, replace)

  Variables that uniquely identify margins: int_health _deriv

. contrast _at, atequations vsquish

Contrasts of conditional marginal effects       Number of obs     =      3,490
Model VCE    : Robust

Expression   : Predicted mean redmeat, predict()
dy/dx w.r.t. : 2.arm 3.arm 4.arm
1._at        : int_health      =           1
2._at        : int_health      =           2
3._at        : int_health      =           3
4._at        : int_health      =           4
5._at        : int_health      =           5

------------------------------------------------
             |         df        chi2     P>chi2
-------------+----------------------------------
1b.arm       |
         _at |  (omitted)
-------------+----------------------------------
2.arm        |
         _at |          3       33.76     0.0000
-------------+----------------------------------
3.arm        |
         _at |          3       33.32     0.0000
-------------+----------------------------------
4.arm        |
         _at |          3       98.37     0.0000
------------------------------------------------

. pwcompare _at, atequations effects vsquish

Pairwise comparisons of conditional marginal effects
Model VCE    : Robust                           Number of obs     =      3,490

Expression   : Predicted mean redmeat, predict()
dy/dx w.r.t. : 2.arm 3.arm 4.arm
1._at        : int_health      =           1
2._at        : int_health      =           2
3._at        : int_health      =           3
4._at        : int_health      =           4
5._at        : int_health      =           5

------------------------------------------------------------------------------
             |   Contrast Delta-method    Unadjusted           Unadjusted
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.arm        |  (base outcome)
-------------+----------------------------------------------------------------
2.arm        |
         _at |
     2 vs 1  |   -.190995   .1251792    -1.53   0.127    -.4363418    .0543518
     3 vs 1  |  -.3459438    .228219    -1.52   0.130    -.7932449    .1013573
     4 vs 1  |  -.4701449   .3124983    -1.50   0.132     -1.08263    .1423405
     5 vs 1  |  -.5681912   .3808993    -1.49   0.136     -1.31474    .1783578
     3 vs 2  |  -.1549488   .1030833    -1.50   0.133    -.3569883    .0470907
     4 vs 2  |  -.2791499   .1874477    -1.49   0.136    -.6465407    .0882408
     5 vs 2  |  -.3771962   .2559733    -1.47   0.141    -.8788947    .1245023
     4 vs 3  |  -.1242011   .0844072    -1.47   0.141    -.2896362    .0412339
     5 vs 3  |  -.2222474   .1530172    -1.45   0.146    -.5221557    .0776608
     5 vs 4  |  -.0980463   .0686529    -1.43   0.153    -.2326035    .0365109
-------------+----------------------------------------------------------------
3.arm        |
         _at |
     2 vs 1  |   .0786849   .1177887     0.67   0.504    -.1521767    .3095465
     3 vs 1  |   .1492335   .2196351     0.68   0.497    -.2812433    .5797104
     4 vs 1  |   .2124214   .3072645     0.69   0.489     -.389806    .8146487
     5 vs 1  |   .2689534   .3822329     0.70   0.482    -.4802094    1.018116
     3 vs 2  |   .0705487   .1018483     0.69   0.489    -.1290703    .2701677
     4 vs 2  |   .1337365   .1894815     0.71   0.480    -.2376405    .5051135
     5 vs 2  |   .1902685   .2644557     0.72   0.472    -.3280552    .7085922
     4 vs 3  |   .0631878   .0876352     0.72   0.471     -.108574    .2349497
     5 vs 3  |   .1197198   .1626134     0.74   0.462    -.1989965    .4384361
     5 vs 4  |    .056532   .0749802     0.75   0.451    -.0904265    .2034904
-------------+----------------------------------------------------------------
4.arm        |
         _at |
     2 vs 1  |    .004769   .1244258     0.04   0.969     -.239101     .248639
     3 vs 1  |   .0145457    .229435     0.06   0.949    -.4351386    .4642301
     4 vs 1  |   .0284346   .3174964     0.09   0.929     -.593847    .6507162
     5 vs 1  |   .0456571   .3907904     0.12   0.907    -.7202779    .8115922
     3 vs 2  |   .0097768   .1050169     0.09   0.926    -.1960526    .2156062
     4 vs 2  |   .0236656   .1930936     0.12   0.902    -.3547909    .4021222
     5 vs 2  |   .0408882   .2664102     0.15   0.878    -.4812662    .5630426
     4 vs 3  |   .0138889   .0880844     0.16   0.875    -.1587534    .1865311
     5 vs 3  |   .0311114   .1614165     0.19   0.847    -.2852591    .3474819
     5 vs 4  |   .0172225     .07334     0.23   0.814    -.1265212    .1609663
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

.
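As a check on what margins is reporting, the WL vs. Control effects can be rebuilt by hand from the coefficients in the glm output above. This is only a minimal sketch: it assumes margins takes the difference in predicted means on the count scale, and the scalar names are hypothetical labels of mine.

```stata
* Sketch: rebuild the WL vs. Control effect on the predicted-mean scale
* from the Poisson coefficients in the glm output (scalar names are hypothetical).
scalar b_cons   = 1.562273     // _cons
scalar b_health = -.0812       // int_health
scalar b_wl     = .075889      // WL main effect
scalar b_wlxh   = -.0439101    // WL#c.int_health

forvalues h = 1/5 {
    * predicted mean under Control and under WL at int_health = h
    scalar mu_c  = exp(b_cons + b_health*`h')
    scalar mu_wl = exp(b_cons + b_wl + (b_health + b_wlxh)*`h')
    * difference in predicted means, i.e. the conditional effect of WL
    display "int_health = `h':  WL effect = " %10.7f (mu_wl - mu_c)
}
```

Up to rounding of the printed coefficients, the displayed values should come out close to the dy/dx column for 2.arm (.1429054 at int_health = 1, -.4252858 at int_health = 5). Setting b_wlxh to 0 in this sketch still gives effects that change with int_health, because the Control mean exp(b_cons + b_health*h) changes with h.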
OLS, int_health continuous:
Code:
. regress redmeat arm##c.int_health

      Source |       SS           df       MS      Number of obs   =     3,490
-------------+----------------------------------   F(7, 3482)      =     26.58
       Model |  520.290212         7  74.3271732   Prob > F        =    0.0000
    Residual |  9738.00893     3,482  2.79667115   R-squared       =    0.0507
-------------+----------------------------------   Adj R-squared   =    0.0488
       Total |  10258.2991     3,489  2.94018319   Root MSE        =    1.6723

----------------------------------------------------------------------------------
         redmeat |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
             arm |
              WL |    .207299   .3561368     0.58   0.561    -.4909589     .905557
               T |  -.6925493   .3646463    -1.90   0.058    -1.407491    .0223929
             WLT |  -.8232045   .3700983    -2.22   0.026    -1.548836   -.0975729
                 |
      int_health |  -.2942289   .0656121    -4.48   0.000    -.4228709   -.1655869
                 |
arm#c.int_health |
              WL |  -.1320273    .091971    -1.44   0.151    -.3123498    .0482952
               T |   .0644555   .0944202     0.68   0.495    -.1206691    .2495801
             WLT |   .0139352   .0957076     0.15   0.884    -.1737134    .2015838
                 |
           _cons |   4.630014   .2548842    18.17   0.000     4.130276    5.129752
----------------------------------------------------------------------------------

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of redmeat

         chi2(1)      =     0.57
         Prob > chi2  =   0.4511

. testparm arm#c.int_health

 ( 1)  2.arm#c.int_health = 0
 ( 2)  3.arm#c.int_health = 0
 ( 3)  4.arm#c.int_health = 0

       F(  3,  3482) =    1.63
            Prob > F =    0.1803

. margins, dydx(arm) at(int_health = (1(1)5)) post vsquish

Conditional marginal effects                    Number of obs     =      3,490
Model VCE    : OLS

Expression   : Linear prediction, predict()
dy/dx w.r.t. : 2.arm 3.arm 4.arm
1._at        : int_health      =           1
2._at        : int_health      =           2
3._at        : int_health      =           3
4._at        : int_health      =           4
5._at        : int_health      =           5

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.arm        |  (base outcome)
-------------+----------------------------------------------------------------
2.arm        |
         _at |
          1  |   .0752718    .267286     0.28   0.778    -.4487813    .5993249
          2  |  -.0567555   .1815697    -0.31   0.755    -.4127492    .2992383
          3  |  -.1887828   .1068204    -1.77   0.077    -.3982197    .0206542
          4  |    -.32081    .082286    -3.90   0.000    -.4821437   -.1594763
          5  |  -.4528373    .138017    -3.28   0.001    -.7234397   -.1822348
-------------+----------------------------------------------------------------
3.arm        |
         _at |
          1  |  -.6280937   .2733197    -2.30   0.022    -1.163977   -.0922107
          2  |  -.5636382   .1851236    -3.04   0.002    -.9265999   -.2006766
          3  |  -.4991827   .1080193    -4.62   0.000    -.7109703   -.2873951
          4  |  -.4347272    .083042    -5.24   0.000    -.5975431   -.2719113
          5  |  -.3702717   .1412591    -2.62   0.009    -.6472306   -.0933127
-------------+----------------------------------------------------------------
4.arm        |
         _at |
          1  |  -.8092692   .2774444    -2.92   0.004    -1.353239   -.2652991
          2  |   -.795334   .1878774    -4.23   0.000    -1.163695   -.4269729
          3  |  -.7813988   .1092719    -7.15   0.000    -.9956422   -.5671554
          4  |  -.7674636    .083082    -9.24   0.000    -.9303579   -.6045693
          5  |  -.7535284   .1420731    -5.30   0.000    -1.032083   -.4749733
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

. marginsplot, ylabel(-2(1)2) xlabel(.) xtitle("Interest in health (lowest to highest)") legend(order(4 "WL" 5 "T" 6 "WLT") row(1)) name(healthOLS, replace)

  Variables that uniquely identify margins: int_health _deriv

.