  • Why might Clopper Pearson CIs differ in Stata vs SAS output?

    Hello,

    As a data validation exercise, I am using Stata to replicate an analysis of complex survey data, obtaining weighted prevalence estimates and Clopper-Pearson (exact) confidence intervals by demographic characteristics. The original analysis was carried out in SAS-callable SUDAAN. The CIs for the prevalence estimates differ by varying degrees, while all other output matches exactly, including the prevalence estimates themselves and the PRs with their corresponding CIs.

    My question is: why might the resulting CIs for proportions differ somewhat between Stata and SAS output? The difference seems to be greater when there are fewer observations in the subpopulation.

    Here is an example of the Stata code I am using to obtain prevalence estimates and exact CIs for parent physical health by child's age (dichotomised):

    Code:
    svy, subpop(if include == 1 & age == 0): prop parent_physhealth, citype(exact)
    svy, subpop(if include == 1 & age == 1): prop parent_physhealth, citype(exact)
    Thank you so much in advance,
    Helena
    Last edited by Helena Hutchins; 09 Mar 2023, 14:40.

  • #2
    The binomial method is discussed for PROC FREQ in the SAS documentation here, and it agrees with the method used by Stata (-help ci-). I can also confirm that some simple tests show the CIs match. That's about the extent of the investigating I'm willing to do, since you are not comparing SAS per se, but SUDAAN and survey methods. I have no knowledge of SUDAAN, so you can poke around in that documentation if you wish. Perhaps the difference comes from SUDAAN using a finite-population correction; I can't say.
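
    For readers who want to see what the binomial (Clopper-Pearson) method computes, here is a minimal Python sketch for a plain, unweighted count. It is not from the posts above, the function names are illustrative, and it does not reproduce what Stata's svy commands or SUDAAN do with survey data. It only shows how much the interval depends on the n that goes into it, which is one possible route (an assumption, not confirmed in this thread) by which two packages could disagree more in small subpopulations.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target, iters=100):
    """Bisection on [0, 1] for g(p) = target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Two-sided Clopper-Pearson interval for x successes in n trials."""
    # Lower limit: the p at which P(X >= x) equals alpha/2 (0 when x == 0).
    if x == 0:
        lower = 0.0
    else:
        lower = _solve_increasing(lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha/2 (1 when x == n).
    if x == n:
        upper = 1.0
    else:
        upper = _solve_increasing(lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    # Same observed proportion (20%), two different sample sizes.
    for x, n in [(2, 10), (20, 100)]:
        lo, hi = clopper_pearson(x, n)
        print(f"x={x}, n={n}: [{lo:.4f}, {hi:.4f}]")
```

    Running it prints the limits for the same 20% proportion at n = 10 and n = 100; the smaller n gives the wider interval, so any difference in the n a package plugs in matters most when n is small.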

    • #3
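      Not the question, but the common label "exact" is a misnomer here. Clopper-Pearson intervals are wasteful: they are conservative, with actual coverage at or above the nominal level, so they tend to be wider than necessary. For the detail see e.g. https://projecteuclid.org/journals/s...009213286.full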
