Hi all,
I am working on data from a randomized controlled experiment in which subjects were randomly assigned to one of four groups and given a shopping task in a virtual online grocery store. I'll denote the groups as Control, WL, T, and WLT. The outcome I am interested in is the number of products containing red meat in the shopping basket at checkout. In the present analysis, I am examining interactions between a number of subject characteristics and the effect of each treatment (e.g., WL vs. Control). I am only interested in associations, so for each characteristic I fit a separate Poisson regression with indicator variables for the study arm, the moderator of interest (e.g., gender), and their interaction. To see whether the treatment effects differ by these subject characteristics, I compute the effects separately at each level of the moderator with margins and run a homogeneity test using contrast.
I am having trouble understanding the results for two of the moderators of interest, namely summary measures of a subject's interest in health and interest in sustainability. These measures are simple averages of items measured on a 1-5 Likert scale (two items for the health measure, five for the sustainability measure). For now I am treating these as continuous variables because they take non-integer values. I follow the procedure above conditioning on values 1, 2, 3, 4, and 5. In each case I get strong rejections, but when I compute pairwise comparisons, none is statistically significant. OLS finds no interaction and the point estimates and standard errors for the conditional treatment effects are similar to the Poisson ones. Similarly, when I treat the moderators as categorical variables (alternately rounding up and down), contrast's results agree with OLS. Interestingly, contrast's output shows that the degrees of freedom for the problematic case (Poisson with continuous moderator) are not equal to 4 (the number of restrictions I am testing), but 3 (one restriction gets dropped).
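To make the setup concrete, here is how I understand the quantities being compared, written out as a sketch. The symbols are my own notation, not Stata's: $\beta_0$ is the constant, $\beta_h$ the coefficient on int_health, $\beta_k$ the main effect of arm $k$, and $\gamma_k$ its interaction with int_health. I am assuming that margins, dydx(arm) reports the discrete change in the predicted mean relative to Control, as the note under its output says.

```latex
\begin{align*}
\text{Poisson:}\quad \Delta_k(x) &= \exp\bigl(\beta_0 + \beta_k + (\beta_h + \gamma_k)\,x\bigr) - \exp\bigl(\beta_0 + \beta_h\,x\bigr),\\
\text{OLS:}\quad \Delta_k(x) &= \beta_k + \gamma_k\,x,
\qquad x \in \{1, 2, 3, 4, 5\}.
\end{align*}
```

For a given arm, the five Poisson effects are nonlinear functions of four coefficients, while the five OLS effects are linear functions of only two. So the four restrictions in the homogeneity test may not all be separately testable, which might be related to the dropped restriction, but I have not been able to confirm this.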
Code and output for interest in health are below. Diagnostic plots are not shown for conciseness (they show slight nonnormality of the Anscombe residuals in the tails and no apparent relationships in the residual vs. fitted plots). Code and output for the Poisson regressions with a categorical moderator are in #2 (character limit).
The P values that I think should more or less correspond yet do not are in red; in OLS these are just the P values for each interaction term because of linearity, i.e. if they are significant the difference in treatment effects at int_health = x and int_health = y will be significant, I think.
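As a rough check of the linearity argument, the sketch below recomputes the conditional treatment effects by hand from the coefficients reported in the glm and regress output in this post. It assumes that margins, dydx(arm) is the discrete change in the predicted mean relative to Control; the function and variable names are mine and purely illustrative.

```python
import math

# Coefficients copied from the glm (Poisson) and regress (OLS) tables in this post.
POISSON = {
    "cons": 1.562273,
    "health": -0.0812,
    "arm": {"WL": 0.075889, "T": -0.1667416, "WLT": -0.1822114},
    "arm_x_health": {"WL": -0.0439101, "T": 0.008172, "WLT": -0.0174026},
}
OLS = {
    "cons": 4.630014,
    "health": -0.2942289,
    "arm": {"WL": 0.207299, "T": -0.6925493, "WLT": -0.8232045},
    "arm_x_health": {"WL": -0.1320273, "T": 0.0644555, "WLT": 0.0139352},
}


def poisson_effect(arm, x):
    """Predicted mean in `arm` minus predicted mean in Control at int_health = x."""
    b = POISSON
    control = math.exp(b["cons"] + b["health"] * x)
    treated = math.exp(
        b["cons"] + b["arm"][arm] + (b["health"] + b["arm_x_health"][arm]) * x
    )
    return treated - control


def ols_effect(arm, x):
    """Linear-model effect of `arm` vs. Control at int_health = x."""
    b = OLS
    return b["arm"][arm] + b["arm_x_health"][arm] * x


# Print both sets of conditional effects side by side for int_health = 1..5.
for arm in ("WL", "T", "WLT"):
    for x in range(1, 6):
        print(
            f"{arm:>3}  int_health={x}  "
            f"Poisson={poisson_effect(arm, x):8.4f}  OLS={ols_effect(arm, x):8.4f}"
        )
```

Under OLS, every one-unit step in int_health changes the effect by exactly the interaction coefficient, so all pairwise comparisons for an arm share that coefficient's t statistic. Under Poisson the steps are unequal (for WL, about -0.19 from 1 to 2 but about -0.10 from 4 to 5), which is what the pwcompare output shows.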
Thanks for any help anyone can provide (and apologies for the long post but more information is better than less).
Maxime
Preliminaries:
Code:
. summarize redmeat, detail

              Number of red meat items purchased
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%            1              0       Obs               3,518
25%            2              0       Sum of Wgt.       3,518

50%            3                      Mean           3.144116
                        Largest       Std. Dev.      1.717456
75%            4              9
90%            5              9       Variance       2.949654
95%            6             10       Skewness       .0920231
99%            7             12       Kurtosis       2.792986
. histogram redmeat, discrete percent
(start=0, width=1)
. tabstat redmeat, stats(variance mean) by(arm) format(%4.2f) nototal
Summary for variables: redmeat
by categories of: arm (Arm)
arm | variance mean
--------+--------------------
Control | 2.84 3.50
WL | 3.22 3.24
T | 2.71 3.07
WLT | 2.73 2.76
-----------------------------
.
Poisson, int_health continuous:
Code:
. glm redmeat arm##c.int_health, family(poisson) vce(robust)
Iteration 0: log pseudolikelihood = -6773.6041
Iteration 1: log pseudolikelihood = -6765.4898
Iteration 2: log pseudolikelihood = -6765.4867
Iteration 3: log pseudolikelihood = -6765.4867
Generalized linear models Number of obs = 3,490
Optimization : ML Residual df = 3,482
Scale parameter = 1
Deviance = 3814.697748 (1/df) Deviance = 1.095548
Pearson = 3146.128499 (1/df) Pearson = .9035406
Variance function: V(u) = u [Poisson]
Link function : g(u) = ln(u) [Log]
AIC = 3.881654
Log pseudolikelihood = -6765.486681 BIC = -24590.26
----------------------------------------------------------------------------------
| Robust
redmeat | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-----------------+----------------------------------------------------------------
arm |
WL | .075889 .089114 0.85 0.394 -.0987712 .2505493
T | -.1667416 .0964557 -1.73 0.084 -.3557913 .0223081
WLT | -.1822114 .103245 -1.76 0.078 -.3845679 .0201451
|
int_health | -.0812 .0167178 -4.86 0.000 -.1139663 -.0484337
|
arm#c.int_health |
WL | -.0439101 .0241868 -1.82 0.069 -.0913154 .0034952
T | .008172 .0261055 0.31 0.754 -.0429938 .0593378
WLT | -.0174026 .0279498 -0.62 0.534 -.0721832 .037378
|
_cons | 1.562273 .0624578 25.01 0.000 1.439858 1.684688
----------------------------------------------------------------------------------
.
. // Diagnostics
.
. gof // Program for deviance and Pearson goodness-of-fit tests as per estat gof after poisson
Goodness-of-fit test statistic Prob > chi2(3482)
Deviance = 3814.698 0.0001
Pearson = 3146.128 1.0000
. predict anscombe, anscombe
(28 missing values generated)
. predict deviance, deviance
(28 missing values generated)
. predict pearson, pearson
(28 missing values generated)
. summarize anscombe deviance pearson
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
anscombe | 3,490 -.1254524 1.065312 -3.14559 4.024217
deviance | 3,490 -.1144572 1.039349 -2.965691 3.980549
pearson | 3,490 -.0000243 .9495935 -2.09706 5.334105
. qnorm anscombe, msize(small) mfcolor(navy%50) mlwidth(thin)
. predict fitted
(option mu assumed; predicted mean redmeat)
(28 missing values generated)
. foreach r in anscombe deviance pearson {
2. lowess `r' fitted, msize(small) mfcolor(navy%50) mlwidth(thin) lineopts(lcolor(midblue)) name(`r'_lowess, replace)
3. }
. drop fitted anscombe deviance pearson
.
. // Moderation tests
.
. testparm arm#c.int_health
( 1) [redmeat]2.arm#c.int_health = 0
( 2) [redmeat]3.arm#c.int_health = 0
( 3) [redmeat]4.arm#c.int_health = 0
chi2( 3) = 4.88
Prob > chi2 = 0.1812
. margins, dydx(arm) at(int_health = (1(1)5)) post vsquish
Conditional marginal effects Number of obs = 3,490
Model VCE : Robust
Expression : Predicted mean redmeat, predict()
dy/dx w.r.t. : 2.arm 3.arm 4.arm
1._at : int_health = 1
2._at : int_health = 2
3._at : int_health = 3
4._at : int_health = 4
5._at : int_health = 5
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.arm | (base outcome)
-------------+----------------------------------------------------------------
2.arm |
_at |
1 | .1429054 .2956879 0.48 0.629 -.4366322 .7224431
2 | -.0480895 .1792539 -0.27 0.788 -.3994208 .3032417
3 | -.2030383 .0995889 -2.04 0.041 -.3982291 -.0078476
4 | -.3272395 .0859704 -3.81 0.000 -.4957384 -.1587405
5 | -.4252858 .1252607 -3.40 0.001 -.6707922 -.1797794
-------------+----------------------------------------------------------------
3.arm |
_at |
1 | -.6448573 .2885458 -2.23 0.025 -1.210397 -.0793178
2 | -.5661724 .1783523 -3.17 0.002 -.9157364 -.2166084
3 | -.4956238 .0981094 -5.05 0.000 -.6879146 -.3033329
4 | -.4324359 .084127 -5.14 0.000 -.5973219 -.26755
5 | -.3759039 .1304803 -2.88 0.004 -.6316406 -.1201673
-------------+----------------------------------------------------------------
4.arm |
_at |
1 | -.7957709 .2993894 -2.66 0.008 -1.382563 -.2089784
2 | -.791002 .1826672 -4.33 0.000 -1.149023 -.4329808
3 | -.7812252 .099314 -7.87 0.000 -.9758771 -.5865733
4 | -.7673363 .0840066 -9.13 0.000 -.9319863 -.6026863
5 | -.7501138 .1282987 -5.85 0.000 -1.001575 -.498653
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
. marginsplot, ylabel(-2(1)2) xlabel(.) xtitle("Interest in health (lowest to highest)") legend(order(4 "WL" 5 "T" 6 "WLT") row(1)) name(healthPOIS, replace)
Variables that uniquely identify margins: int_health _deriv
. contrast _at, atequations vsquish
Contrasts of conditional marginal effects Number of obs = 3,490
Model VCE : Robust
Expression : Predicted mean redmeat, predict()
dy/dx w.r.t. : 2.arm 3.arm 4.arm
1._at : int_health = 1
2._at : int_health = 2
3._at : int_health = 3
4._at : int_health = 4
5._at : int_health = 5
------------------------------------------------
| df chi2 P>chi2
-------------+----------------------------------
1b.arm |
_at | (omitted)
-------------+----------------------------------
2.arm |
_at | 3 33.76 0.0000
-------------+----------------------------------
3.arm |
_at | 3 33.32 0.0000
-------------+----------------------------------
4.arm |
_at | 3 98.37 0.0000
------------------------------------------------
. pwcompare _at, atequations effects vsquish
Pairwise comparisons of conditional marginal effects
Model VCE : Robust Number of obs = 3,490
Expression : Predicted mean redmeat, predict()
dy/dx w.r.t. : 2.arm 3.arm 4.arm
1._at : int_health = 1
2._at : int_health = 2
3._at : int_health = 3
4._at : int_health = 4
5._at : int_health = 5
------------------------------------------------------------------------------
| Contrast Delta-method Unadjusted Unadjusted
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.arm | (base outcome)
-------------+----------------------------------------------------------------
2.arm |
_at |
2 vs 1 | -.190995 .1251792 -1.53 0.127 -.4363418 .0543518
3 vs 1 | -.3459438 .228219 -1.52 0.130 -.7932449 .1013573
4 vs 1 | -.4701449 .3124983 -1.50 0.132 -1.08263 .1423405
5 vs 1 | -.5681912 .3808993 -1.49 0.136 -1.31474 .1783578
3 vs 2 | -.1549488 .1030833 -1.50 0.133 -.3569883 .0470907
4 vs 2 | -.2791499 .1874477 -1.49 0.136 -.6465407 .0882408
5 vs 2 | -.3771962 .2559733 -1.47 0.141 -.8788947 .1245023
4 vs 3 | -.1242011 .0844072 -1.47 0.141 -.2896362 .0412339
5 vs 3 | -.2222474 .1530172 -1.45 0.146 -.5221557 .0776608
5 vs 4 | -.0980463 .0686529 -1.43 0.153 -.2326035 .0365109
-------------+----------------------------------------------------------------
3.arm |
_at |
2 vs 1 | .0786849 .1177887 0.67 0.504 -.1521767 .3095465
3 vs 1 | .1492335 .2196351 0.68 0.497 -.2812433 .5797104
4 vs 1 | .2124214 .3072645 0.69 0.489 -.389806 .8146487
5 vs 1 | .2689534 .3822329 0.70 0.482 -.4802094 1.018116
3 vs 2 | .0705487 .1018483 0.69 0.489 -.1290703 .2701677
4 vs 2 | .1337365 .1894815 0.71 0.480 -.2376405 .5051135
5 vs 2 | .1902685 .2644557 0.72 0.472 -.3280552 .7085922
4 vs 3 | .0631878 .0876352 0.72 0.471 -.108574 .2349497
5 vs 3 | .1197198 .1626134 0.74 0.462 -.1989965 .4384361
5 vs 4 | .056532 .0749802 0.75 0.451 -.0904265 .2034904
-------------+----------------------------------------------------------------
4.arm |
_at |
2 vs 1 | .004769 .1244258 0.04 0.969 -.239101 .248639
3 vs 1 | .0145457 .229435 0.06 0.949 -.4351386 .4642301
4 vs 1 | .0284346 .3174964 0.09 0.929 -.593847 .6507162
5 vs 1 | .0456571 .3907904 0.12 0.907 -.7202779 .8115922
3 vs 2 | .0097768 .1050169 0.09 0.926 -.1960526 .2156062
4 vs 2 | .0236656 .1930936 0.12 0.902 -.3547909 .4021222
5 vs 2 | .0408882 .2664102 0.15 0.878 -.4812662 .5630426
4 vs 3 | .0138889 .0880844 0.16 0.875 -.1587534 .1865311
5 vs 3 | .0311114 .1614165 0.19 0.847 -.2852591 .3474819
5 vs 4 | .0172225 .07334 0.23 0.814 -.1265212 .1609663
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
.
OLS, int_health continuous:
Code:
. regress redmeat arm##c.int_health
Source | SS df MS Number of obs = 3,490
-------------+---------------------------------- F(7, 3482) = 26.58
Model | 520.290212 7 74.3271732 Prob > F = 0.0000
Residual | 9738.00893 3,482 2.79667115 R-squared = 0.0507
-------------+---------------------------------- Adj R-squared = 0.0488
Total | 10258.2991 3,489 2.94018319 Root MSE = 1.6723
----------------------------------------------------------------------------------
redmeat | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------------+----------------------------------------------------------------
arm |
WL | .207299 .3561368 0.58 0.561 -.4909589 .905557
T | -.6925493 .3646463 -1.90 0.058 -1.407491 .0223929
WLT | -.8232045 .3700983 -2.22 0.026 -1.548836 -.0975729
|
int_health | -.2942289 .0656121 -4.48 0.000 -.4228709 -.1655869
|
arm#c.int_health |
WL | -.1320273 .091971 -1.44 0.151 -.3123498 .0482952
T | .0644555 .0944202 0.68 0.495 -.1206691 .2495801
WLT | .0139352 .0957076 0.15 0.884 -.1737134 .2015838
|
_cons | 4.630014 .2548842 18.17 0.000 4.130276 5.129752
----------------------------------------------------------------------------------
. estat hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
Ho: Constant variance
Variables: fitted values of redmeat
chi2(1) = 0.57
Prob > chi2 = 0.4511
. testparm arm#c.int_health
( 1) 2.arm#c.int_health = 0
( 2) 3.arm#c.int_health = 0
( 3) 4.arm#c.int_health = 0
F( 3, 3482) = 1.63
Prob > F = 0.1803
. margins, dydx(arm) at(int_health = (1(1)5)) post vsquish
Conditional marginal effects Number of obs = 3,490
Model VCE : OLS
Expression : Linear prediction, predict()
dy/dx w.r.t. : 2.arm 3.arm 4.arm
1._at : int_health = 1
2._at : int_health = 2
3._at : int_health = 3
4._at : int_health = 4
5._at : int_health = 5
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.arm | (base outcome)
-------------+----------------------------------------------------------------
2.arm |
_at |
1 | .0752718 .267286 0.28 0.778 -.4487813 .5993249
2 | -.0567555 .1815697 -0.31 0.755 -.4127492 .2992383
3 | -.1887828 .1068204 -1.77 0.077 -.3982197 .0206542
4 | -.32081 .082286 -3.90 0.000 -.4821437 -.1594763
5 | -.4528373 .138017 -3.28 0.001 -.7234397 -.1822348
-------------+----------------------------------------------------------------
3.arm |
_at |
1 | -.6280937 .2733197 -2.30 0.022 -1.163977 -.0922107
2 | -.5636382 .1851236 -3.04 0.002 -.9265999 -.2006766
3 | -.4991827 .1080193 -4.62 0.000 -.7109703 -.2873951
4 | -.4347272 .083042 -5.24 0.000 -.5975431 -.2719113
5 | -.3702717 .1412591 -2.62 0.009 -.6472306 -.0933127
-------------+----------------------------------------------------------------
4.arm |
_at |
1 | -.8092692 .2774444 -2.92 0.004 -1.353239 -.2652991
2 | -.795334 .1878774 -4.23 0.000 -1.163695 -.4269729
3 | -.7813988 .1092719 -7.15 0.000 -.9956422 -.5671554
4 | -.7674636 .083082 -9.24 0.000 -.9303579 -.6045693
5 | -.7535284 .1420731 -5.30 0.000 -1.032083 -.4749733
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
. marginsplot, ylabel(-2(1)2) xlabel(.) xtitle("Interest in health (lowest to highest)") legend(order(4 "WL" 5 "T" 6 "WLT") row(1)) name(healthOLS, replace)
Variables that uniquely identify margins: int_health _deriv
.