As per Nick Cox's "request"!
. * Three ways to compute the N-1 Chi-square with Stata.
.
. * First, generate a data set containing the 4 cell counts.
. * I'll use the well-known 2x2 table showing the relationship
. * between type of feeding (breast vs bottle) and malocclusion
. * of the teeth in infants (see Yates, 1934; Kendall & Stuart,
. * 1967; Campbell, 2007).
.
. clear all
. input rowvar colvar N
rowvar colvar N
1. 0 0 4
2. 0 1 16
3. 1 0 1
4. 1 1 21
5. end
. list
     +----------------------+
     | rowvar   colvar    N |
     |----------------------|
  1. |      0        0    4 |
  2. |      0        1   16 |
  3. |      1        0    1 |
  4. |      1        1   21 |
     +----------------------+
.
. * METHOD 1.
.
. * Write a small program to compute E.S. Pearson's N-1 Chi-square test
. * using stored results from the 'tabulate' command.
. * Program name: ESPChiSq, short for Egon S. Pearson's N-1 Chi-Square.
. capture program drop ESPChiSq
. quietly program ESPChiSq
.
. * Use tabulate command to compute Pearson's Chi-square.
. tabulate rowvar colvar [fweight = N], chi2
           |        colvar
    rowvar |         0          1 |     Total
-----------+----------------------+----------
         0 |         4         16 |        20
         1 |         1         21 |        22
-----------+----------------------+----------
     Total |         5         37 |        42

          Pearson chi2(1) =   2.3858   Pr = 0.122
. ESPChiSq
Egon S. Pearson's N-1 Chi-Square Test
N-1 ChiSq df p-value
----------------------------
2.3290418 1 .12698002
----------------------------
.
. * Could also be done using tabi (i.e., immediate form of tabulate).
.
. tabi 4 16 \ 1 21, chi2
           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |         4         16 |        20
         2 |         1         21 |        22
-----------+----------------------+----------
     Total |         5         37 |        42

          Pearson chi2(1) =   2.3858   Pr = 0.122
. ESPChiSq
Egon S. Pearson's N-1 Chi-Square Test
N-1 ChiSq df p-value
----------------------------
2.3290418 1 .12698002
----------------------------
.
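The adjustment behind Method 1 is just Pearson's chi-square multiplied by (N-1)/N. As a rough cross-check, here is a minimal by-hand sketch that works from the four cell counts alone. The local macro names (a, b, c, d, n, chi2, chi2n1) are illustrative and not part of the original program, and locals are used so that nothing collides with the variables already in memory. It should reproduce the 2.3858 and 2.3290 shown in the log above.

```stata
* By-hand sketch for a 2x2 table; macro names are illustrative only
local a = 4
local b = 16
local c = 1
local d = 21
local n = `a' + `b' + `c' + `d'

* Pearson chi-square: N*(ad - bc)^2 / (product of the four marginal totals)
local chi2 = `n' * (`a'*`d' - `b'*`c')^2 / ///
    ((`a'+`b') * (`c'+`d') * (`a'+`c') * (`b'+`d'))

* E.S. Pearson's N-1 version rescales by (N-1)/N
local chi2n1 = (`n' - 1) / `n' * `chi2'

display "Pearson chi2 = " %9.4f `chi2'
display "N-1 chi2     = " %9.4f `chi2n1'
display "p-value      = " %6.4f chi2tail(1, `chi2n1')
```

Run it from a do-file (or paste the lines in one go) so the locals stay in scope.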
. * ----------------------------------------------
.
. * METHOD 2.
.
. * Compute a constant stratum variable
. generate Stratum = 0
. list
     +--------------------------------+
     | rowvar   colvar    N   Stratum |
     |--------------------------------|
  1. |      0        0    4         0 |
  2. |      0        1   16         0 |
  3. |      1        0    1         0 |
  4. |      1        1   21         0 |
     +--------------------------------+
.
. * Use tab3way to display the contingency table.
. tab3way rowvar colvar Stratum [fweight=N] , rowtot coltot
Frequency weights are based on the expression: N
Table entries are cell frequencies
Missing categories ignored
-------------------------------
| Stratum and colvar
| -------- 0 --------
rowvar | 0 1 TOTAL
----------+--------------------
0 | 4 16 20
1 | 1 21 22
TOTAL | 5 37 42
-------------------------------
. * Use the cc command to compute the Mantel-Haenszel statistic & p-value.
. cc rowvar colvar [fweight=N], by(Stratum)
         Stratum |       OR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
               0 |       5.25      .4440375    270.558     .3809524 (exact)
-----------------+-------------------------------------------------
           Crude |       5.25      .4440375    270.558              (exact)
    M-H combined |       5.25      .5338913   51.62568
-------------------------------------------------------------------

Test that combined OR = 1:
                                Mantel-Haenszel chi2(1) =      2.33
                                                Pr>chi2 =    0.1270
.
. * The M-H test above matches the linear-by-linear association test
. * from SPSS.
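Why a single constant stratum gives the N-1 statistic can be sketched from the usual form of the Mantel-Haenszel statistic. The notation is introduced here for illustration and does not come from the log: cell counts a, b, c, d, row totals r_1, r_2, column totals c_1, c_2. The sketch also assumes no continuity correction is applied, which is consistent with the 2.33 printed above.

```latex
\[
\chi^2_{MH} = \frac{\bigl(a - E[a]\bigr)^2}{\operatorname{Var}(a)},
\qquad E[a] = \frac{r_1 c_1}{N},
\qquad \operatorname{Var}(a) = \frac{r_1 r_2 c_1 c_2}{N^2 (N-1)}
\]
\[
a - \frac{r_1 c_1}{N} = \frac{ad - bc}{N}
\;\Longrightarrow\;
\chi^2_{MH} = \frac{(N-1)(ad - bc)^2}{r_1 r_2 c_1 c_2}
            = \frac{N-1}{N}\,\chi^2_{\text{Pearson}}
\]
\[
\text{Here: } \frac{41 \times (4 \cdot 21 - 16 \cdot 1)^2}{20 \cdot 22 \cdot 5 \cdot 37}
= \frac{41 \times 4624}{81400} \approx 2.329
\]
```

The hypergeometric variance carries the N-1 in its denominator, and that is where the (N-1)/N factor comes from.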
.
. * Now see what happens if the stratification variable is omitted.
. cc row col [fweight=N]
                                                         Proportion
                 |   Exposed   Unexposed  |      Total     Exposed
-----------------+------------------------+------------------------
           Cases |        21           1  |         22       0.9545
        Controls |        16           4  |         20       0.8000
-----------------+------------------------+------------------------
           Total |        37           5  |         42       0.8810
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
      Odds ratio |         5.25           |    .4440375     270.558 (exact)
 Attr. frac. ex. |     .8095238           |   -1.252062    .9963039 (exact)
 Attr. frac. pop |     .7727273           |
                 +-------------------------------------------------
                               chi2(1) =     2.39  Pr>chi2 = 0.1224
. * If I omit the constant Stratum variable, Pearson's Chi-square is computed.
.
. * ----------------------------------------------
.
. * METHOD 3.
.
. * As Howell's notes below show, Mantel's Chi-square for linear trend
. * (a.k.a. the test of linear-by-linear association in SPSS) is equal
. * to Pearson's r-squared * (N-1).
. * https://www.uvm.edu/~dhowell/methods7/Supplements/OrdinalChiSq.html
.
. quietly correlate rowvar colvar [fweight = N]
. * return list
. local Linear = (r(N)-1)*r(rho)^2
. local dfLinear = 1
. display "N-1 Chi-square = " `Linear'
N-1 Chi-square = 2.3290418
. display " p = " chi2tail(1,`Linear')
p = .12698002
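The reason Method 3 agrees with Method 1 for a 2x2 table can be sketched as follows. With two 0/1 variables, Pearson's r is the phi coefficient, so r-squared is Pearson's chi-square divided by N. The cell notation a, b, c, d and the marginal totals r_1, r_2, c_1, c_2 are introduced here for illustration only.

```latex
\[
r = \phi = \frac{ad - bc}{\sqrt{r_1 r_2 c_1 c_2}},
\qquad
r^2 = \frac{(ad - bc)^2}{r_1 r_2 c_1 c_2} = \frac{\chi^2_{\text{Pearson}}}{N}
\]
\[
(N-1)\,r^2 = \frac{N-1}{N}\,\chi^2_{\text{Pearson}}
\]
\[
\text{Here: } r^2 = \frac{4624}{81400} \approx 0.0568,
\qquad (42 - 1) \times 0.0568 \approx 2.329
\]
```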
. * ----------------------------------------------
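* ESPChiSq reads r(N), r(chi2), r(r) and r(c) left behind by tabulate or tabi,
* so it must be run immediately after one of those commands (with the chi2 option).
capture program drop ESPChiSq
quietly program ESPChiSq
display "Egon S. Pearson's N-1 Chi-Square Test"
display "N-1 ChiSq df p-value"
display "----------------------------"
* Columns: (N-1)/N * Pearson chi2, df = (rows-1)*(cols-1), p-value on 1 df (2x2 tables)
display (r(N)-1)/r(N)*r(chi2) " " (r(r)-1)*(r(c)-1) ///
" " chi2tail(1,(r(N)-1)/r(N)*r(chi2))
display "----------------------------"
end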