Dear Statalisters,
I noticed major differences between -metan- and -cc- for the Breslow-Day test of homogeneity of odds ratios. The differences seem to appear when some cells of the 2x2 tables contain zero events. Please see the code below, where:
a = events among exposed persons
b = non-events among exposed persons
c = events among non-exposed persons
d = non-events among non-exposed persons
If I replace both 0's with any integer >=1, -cc- and -metan- give nearly identical results. With the zeros, however, the results differ greatly. Hence my question: which one should we trust for sparse data?
Code:
clear
input a b c d
2 43 4 40
2 35 0 37
3 36 0 39
4 34 2 36
5 40 3 42
3 45 2 46
end

* Breslow-Day test from -metan- (one row per stratum, wide format)
metan a b c d, or breslow nograph
local bd1 = r(het)

* Reshape to one row per cell, then expand to one row per person for -cc-
rename a counts1
rename b counts2
rename c counts3
rename d counts4
generate strata = _n
qui reshape long counts, i(strata) j(j)
generate exposure = cond(j==1|j==2,1,0)
drop j
generate event = cond(mod(_n,2),1,0)
qui expand counts

* Breslow-Day test from -cc-
cc event exposure, by(strata) bd woolf
local bd2 = r(chi2_bd)

* Breslow-Day's chi-square from -metan-
display `bd1'
* Breslow-Day's chi-square from -cc-
display `bd2'
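One thing that may be worth checking in the code above: -expand- does not drop rows whose count is zero. It keeps each such row as a single observation, so the person-level data passed to -cc- may contain one event where the table has none. The sketch below is only a possible cross-check, not a verified explanation of the discrepancy. It feeds the cell counts to -cc- as frequency weights instead of expanding them. The variable names follow the code above, and the -woolf- option is left out because, as far as I know, it only changes the confidence-interval method.

```stata
clear
input a b c d
2 43 4 40
2 35 0 37
3 36 0 39
4 34 2 36
5 40 3 42
3 45 2 46
end

* One row per cell of each 2x2 table
rename (a b c d) (counts1 counts2 counts3 counts4)
generate strata = _n
reshape long counts, i(strata) j(j)
generate exposure = inlist(j, 1, 2)   // a and b are the exposed cells
generate event    = inlist(j, 1, 3)   // a and c are the event cells
drop j

* Cells with a zero count; -expand- would keep each as one observation
count if counts == 0

* Frequency weights pass the cell counts to -cc- without expanding
cc event exposure [fweight = counts], by(strata) bd
display r(chi2_bd)
```

If this chi-square differs from the one obtained after -expand-, the zero cells are being handled differently in the two data layouts, and that would need to be settled before comparing -cc- with -metan-.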
All the best,
Tiago