Exact confidence intervals proportions

Francis Mueller

Join Date: May 2018
Posts: 3


25 May 2018, 08:49

Code example

Code:

. csi 0 8 10 10, exact

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |         0           8  |          8
        Noncases |        10          10  |         20
-----------------+------------------------+------------
           Total |        10          18  |         28
                 |                        |
            Risk |         0    .4444444  |   .2857143
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.4444444       |   -.6739982   -.2148907
      Risk ratio |                0       |           .           .
 Prev. frac. ex. |                1       |           .           .
 Prev. frac. pop |         .3571429       |
                 +-------------------------------------------------
                                  1-sided Fisher's exact P = 0.0141
                                  2-sided Fisher's exact P = 0.0251

According to the help file and the technical note, the confidence intervals for the risk difference are exact; however, no method is specified. Which method is used?
Is this method valid for small samples as well?
Thank you.


Bruce Weaver

Join Date: May 2014
Posts: 1129

25 May 2018, 09:39

Hi Francis. The equation is shown on p. 45 here:

https://www.stata.com/manuals13/stepitab.pdf

My copy of Robert Newcombe's classic book on CIs for proportions tells me that is the Wald method. As an exercise, I programmed it myself to see if I could duplicate the result from -csi-.

Code:

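clear *
* Cell counts in the order -csi- expects:
* a = exposed cases, b = unexposed cases, c = exposed noncases, d = unexposed noncases
input byte(a b c d)
0 8 10 10
end
* Group sizes: n1 = exposed, n0 = unexposed
generate byte n1 = a+c
generate byte n0 = b+d
* Risk difference (exposed minus unexposed)
generate RD = (a/n1) - (b/n0)
* Wald SE, i.e. sqrt(p1*(1-p1)/n1 + p0*(1-p0)/n0) written in cell counts
generate SE = sqrt(a*c/n1^3 + b*d/n0^3)
* Two-sided 95% critical value and Wald limits
generate zcrit = invnormal(.975)
generate lower = RD-zcrit*SE
generate upper = RD+zcrit*SE
list n1 n0 zcrit RD lower upper
csi 0 8 10 10, exact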

Output:

Code:

. list n1 n0 zcrit RD lower upper

     +--------------------------------------------------------+
     | n1   n0      zcrit          RD       lower       upper |
     |--------------------------------------------------------|
  1. | 10   18   1.959964   -.4444444   -.6739982   -.2148907 |
     +--------------------------------------------------------+

. csi 0 8 10 10, exact

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |         0           8  |          8
        Noncases |        10          10  |         20
-----------------+------------------------+------------
           Total |        10          18  |         28
                 |                        |
            Risk |         0    .4444444  |   .2857143
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.4444444       |   -.6739982   -.2148907
      Risk ratio |                0       |           .           .
 Prev. frac. ex. |                1       |           .           .
 Prev. frac. pop |         .3571429       |
                 +-------------------------------------------------
                                  1-sided Fisher's exact P = 0.0141
                                  2-sided Fisher's exact P = 0.0251

You can use Joseph Coveney's -rdci- program (from the Stata Journal) to get CIs using other (better) methods, e.g.:

Code:

. rdcii 0 8 10 10

Confidence intervals for risk difference

       Risk for unexposed (p0): 0.444
         Risk for exposed (p1): 0.000
     Risk difference (p1 - p0): -0.444

-----------------------------------------
         Method      [95% Conf. Interval]
-----------------------------------------
     Agresti-Caffo   -0.635        -0.098
Newcombe Method 10   -0.663        -0.103
       Wallenstein   -0.663        -0.110
Miettinen-Nurminen        .             .
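
For readers without -rdci- installed, here is a minimal sketch of how the "Newcombe Method 10" row can be computed by hand. It assumes that this row is Newcombe's hybrid score interval, built from the Wilson limits of each proportion. The scalar names are made up for illustration and are not part of -rdci-.

```stata
* Sketch only: Newcombe hybrid score interval for p1 - p0, computed by hand.
* Assumption: "Newcombe Method 10" = square-and-add of Wilson score limits.
scalar zq      = invnormal(.975)
scalar n_exp   = 10
scalar x_exp   = 0
scalar n_unexp = 18
scalar x_unexp = 8
scalar p_exp   = x_exp/n_exp
scalar p_unexp = x_unexp/n_unexp

* Wilson score limits for the exposed group
scalar h_exp  = zq*sqrt(p_exp*(1-p_exp)/n_exp + zq^2/(4*n_exp^2))
scalar lo_exp = (p_exp + zq^2/(2*n_exp) - h_exp)/(1 + zq^2/n_exp)
scalar hi_exp = (p_exp + zq^2/(2*n_exp) + h_exp)/(1 + zq^2/n_exp)

* Wilson score limits for the unexposed group
scalar h_unexp  = zq*sqrt(p_unexp*(1-p_unexp)/n_unexp + zq^2/(4*n_unexp^2))
scalar lo_unexp = (p_unexp + zq^2/(2*n_unexp) - h_unexp)/(1 + zq^2/n_unexp)
scalar hi_unexp = (p_unexp + zq^2/(2*n_unexp) + h_unexp)/(1 + zq^2/n_unexp)

* Combine the single-sample limits into limits for the difference
scalar rd_hat   = p_exp - p_unexp
scalar nb_lower = rd_hat - sqrt((p_exp - lo_exp)^2 + (hi_unexp - p_unexp)^2)
scalar nb_upper = rd_hat + sqrt((hi_exp - p_exp)^2 + (p_unexp - lo_unexp)^2)

display "Newcombe hybrid score 95% CI: " %6.3f nb_lower "  " %6.3f nb_upper
```

With the counts from this thread, the limits should agree with the -0.663 and -0.103 shown in the rdcii output above. Unlike the Wald interval, this interval can be asymmetric about the point estimate.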

HTH.

--
Bruce Weaver
Version: Stata/MP 18.5 (Windows)


Francis Mueller

Join Date: May 2018

Posts: 3
#3

28 May 2018, 04:19

Hi Bruce,
Thank you very much for the quick answer.
Hence, the method is valid only for large samples. Even though -rdcii- provides better options, I found nothing comparable to the R package ExactCIdiff, which would be appropriate for small samples. Do you agree?
Thanks again.
Bruce Weaver

Join Date: May 2014

Posts: 1129
#4

28 May 2018, 07:16

Hello Francis. I have not looked real hard, but I have not (so far) found any user-written programs to compute exact CIs for the risk difference.

Bear in mind that exact CIs are only "better" than approximate CIs when your criterion is that the coverage probability must be at least (1-alpha)*100%. When that is what you demand, the actual coverage can be quite a bit higher than the nominal value, especially when samples are small.
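
To make the coverage point concrete, here is a rough sketch that computes the actual coverage of the nominal 95% Wald interval for one pair of true risks by enumerating every possible 2x2 table. The group sizes come from this thread; the true risks 0.05 and 0.45 are arbitrary values picked for illustration. Note that it clears the data in memory.

```stata
* Sketch only: exact coverage of the Wald interval at assumed true risks.
clear
local n1 10
local n0 18
* Hypothetical true risks (illustrative assumptions, not estimates)
local p1 0.05
local p0 0.45
scalar true_rd = `p1' - `p0'

* One observation per possible table: a = 0..n1 exposed cases,
* b = 0..n0 unexposed cases
local N = (`n1' + 1)*(`n0' + 1)
set obs `N'
generate int a = mod(_n - 1, `n1' + 1)
generate int b = floor((_n - 1)/(`n1' + 1))

* Probability of each table under independent binomials
generate double prob = binomialp(`n1', a, `p1')*binomialp(`n0', b, `p0')

* Wald interval for each table (same formula as earlier in the thread)
generate double rd = a/`n1' - b/`n0'
generate double se = sqrt(a*(`n1' - a)/`n1'^3 + b*(`n0' - b)/`n0'^3)
generate double lower = rd - invnormal(.975)*se
generate double upper = rd + invnormal(.975)*se

* Coverage = total probability of the tables whose interval contains true_rd
generate byte covered = (lower <= true_rd) & (true_rd <= upper)
generate double cp = prob*covered
quietly summarize cp
scalar wald_cov = r(sum)
display "Actual coverage of the nominal 95% Wald interval: " %6.4f wald_cov
```

Rerunning this over a grid of true risks shows how far the actual coverage can drift from 95% in either direction. An exact method is instead built so that coverage never falls below the nominal level, which is why it tends to be conservative.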

HTH.

Francis Mueller

Join Date: May 2018

Posts: 3
#5

28 May 2018, 08:17

Thanks, Bruce.
Did I interpret your answer correctly: for any alpha, |LB(exact)| <= |LB(approx)|, and likewise for the upper bound of the CI? I had the impression that the Wald method can either overestimate or underestimate the CI for small samples.
Thanks.
Bruce Weaver

Join Date: May 2014

Posts: 1129
#6

28 May 2018, 09:19

Hi Francis. I'll point to the first paragraph of Section 5 (Conclusion and Extensions) in Agresti & Coull (1998). They say it much better than I ever could.
http://users.stat.ufl.edu/~aa/articl...coull_1998.pdf

Notice that they are not claiming statistical proof; they are merely expressing a preference for approximate CIs under ordinary circumstances: "For most applications, we prefer the latter" (emphasis added). But see also the final paragraph of the article:

Exact inference has an important place in statistical inference of discrete data, in particular for sparse contingency table problems for which large-sample chi-squared statistics are often unreliable. However, approximate results are sometimes more useful than exact results, because of the inherent conservativeness of exact methods.

HTH.
