
  • Sample size for specificity/sensitivity

    Hi all,
    I would like to conduct a study to investigate whether a protein in saliva (a continuous measurement) can be used as a biomarker to predict a disease (yes or no). I could not find how to determine, in Stata, the sample size needed for a specificity/sensitivity of, say, 0.8 or 0.9.
    I would be grateful for any help.

    Regards

    Stata 15.1 on Mac

  • #2
    Hello Abdelilah. I think you need to take the approach described on the webpage cited in the code below, although I would use the Wilson method for computing the CIs rather than the Wald method shown there. Here is some code for the example given on that page:

    Code:
    // Example from http://www.pmean.com/99/diag.html
    
    // N=50, sensitivity = 75%
    // 75% of 50 = 37.5, so estimate CI using both 37 and 38
    cii proportions 50 37, wald
    display "Margin of error = " r(ub)-r(proportion)
    cii proportions 50 38, wald
    display "Margin of error = " r(ub)-r(proportion)
    
    // I would use the Wilson method
    cii proportions 50 37, wilson
    display "Margin of error = " r(ub)-r(proportion)
    cii proportions 50 38, wilson
    display "Margin of error = " r(ub)-r(proportion)
    
    // N=50, specificity = 90%
    cii proportions 50 45, wald
    display "Margin of error = " r(ub)-r(proportion)
    cii proportions 50 45, wilson
    display "Margin of error = " r(ub)-r(proportion)
    If the margin of error is larger than desired, repeat with a larger N until the margin of error is acceptable.
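
    The "repeat with a larger N" step could be automated along these lines. This is only a sketch: the anticipated sensitivity of 0.80, the target half-width of 0.10, the search grid, and the local macro names are all assumptions chosen for illustration, not values from the original question. Because the Wilson interval is not symmetric about the estimate, the sketch uses half of the full interval width rather than r(ub)-r(proportion).

    ```stata
    // Sketch: first N on the grid whose Wilson 95% CI half-width is <= target
    // Assumed for illustration: p = 0.80, target = 0.10, grid 20(10)500
    local p        = 0.80
    local target   = 0.10
    local n_needed = .
    forvalues n = 20(10)500 {
        local k = round(`p' * `n')              // expected number of true positives
        quietly cii proportions `n' `k', wilson
        local half = (r(ub) - r(lb)) / 2        // half of the full interval width
        if `half' <= `target' {
            local n_needed = `n'
            display "N = `n', k = `k', half-width = " %5.3f `half'
            continue, break
        }
    }
    ```

    The same loop can be rerun with the anticipated specificity to get the number of non-diseased subjects. A finer grid, e.g. 20(1)500, gives a more exact answer at the cost of more iterations.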

    Notice that in Steve Simon's example, the diseased and non-diseased samples appear to have been drawn separately so that the two groups are the same size. Another way would be to draw one random sample from the population and then classify the subjects as diseased or non-diseased. If you are doing the latter, the values of N you use for sensitivity and specificity must properly reflect the prevalence of disease in the population.
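
    For the single-sample design, one rough way to translate the per-group N into a total N is to divide by the expected proportion in each group and take the larger result. The numbers below (60 per group, prevalence 0.25) and the macro names are assumptions for illustration only. Note that this works with expected counts; the actual number of diseased subjects in a random sample will vary around it.

    ```stata
    // Sketch: total N for one random sample, given an assumed prevalence
    local n_sens = 60      // diseased subjects wanted for the sensitivity CI (assumed)
    local n_spec = 60      // non-diseased subjects wanted for the specificity CI (assumed)
    local prev   = 0.25    // assumed prevalence of disease in the population

    local N_dis   = ceil(`n_sens' / `prev')         // total N so expected diseased >= n_sens
    local N_non   = ceil(`n_spec' / (1 - `prev'))   // total N so expected non-diseased >= n_spec
    local N_total = max(`N_dis', `N_non')

    display "Total N needed (expected counts): `N_total'"
    ```

    With a low prevalence, the diseased group drives the total, which is one reason separate sampling of cases and non-cases can be attractive.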

    HTH.
    --
    Bruce Weaver
    Email: [email protected]
    Version: Stata/MP 18.5 (Windows)



    • #3
      You realize, right, that you can get any value of specificity you want just by setting the threshold arbitrarily high, even if the receiver operating characteristic curve for the assay lies on the diagonal? Likewise, for sensitivity by setting the threshold arbitrarily low.
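
      A small simulation can make this concrete. The sketch below uses made-up data in which the marker is generated independently of disease status, so its ROC curve lies on the diagonal in expectation; the sample size, prevalence, seed, and cutoff are all arbitrary assumptions.

      ```stata
      // Sketch: an uninformative marker still yields high specificity at a high cutoff
      clear
      set seed 12345
      set obs 1000
      generate byte disease = runiform() < 0.3    // assumed prevalence of 0.3
      generate double marker = rnormal()          // independent of disease by construction
      generate byte testpos = marker > 2          // arbitrarily high threshold

      quietly summarize testpos if disease == 0
      scalar spec = 1 - r(mean)                   // proportion of non-diseased testing negative
      quietly summarize testpos if disease == 1
      scalar sens = r(mean)                       // proportion of diseased testing positive

      display "Specificity = " %5.3f spec "   Sensitivity = " %5.3f sens
      ```

      The specificity comes out close to 1 and the sensitivity close to 0, even though the marker carries no information about the disease. Lowering the cutoff reverses the pattern.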

      For scoping out sample size, you might be better off with some kind of assessment of the regression coefficient in a logistic model instead of a supposed specificity or sensitivity value. Once you have data in hand showing that the salivary level of the protein has some nonnegligible predictive value for the disease, you can follow up with investigations into calibration and discrimination in the context of the prevalence of the disease in the population of patients for whom the candidate clinical laboratory test is intended.
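
      One way to scope sample size around the logistic regression coefficient is by simulation. The following is a minimal sketch, not a definitive method: the program name simlogit is hypothetical, and the sample size (100), intercept (-1), slope (0.5 per SD of a standardized marker), and number of replications are placeholder assumptions that you would replace with values plausible for your assay.

      ```stata
      // Sketch: simulated power of the Wald test for the marker's logit coefficient
      capture program drop simlogit
      program define simlogit, rclass
          args n b0 b1                          // sample size, intercept, slope (all assumed)
          drop _all
          set obs `n'
          generate double x = rnormal()         // standardized marker level
          generate byte y = runiform() < invlogit(`b0' + `b1' * x)
          quietly logit y x
          return scalar p = 2 * normal(-abs(_b[x] / _se[x]))   // two-sided Wald p-value
      end

      set seed 2718
      simulate p = r(p), reps(500) nodots: simlogit 100 -1 0.5
      generate byte reject = p < 0.05
      quietly summarize reject
      display "Estimated power = " %4.2f r(mean)
      ```

      Rerunning with larger values of the first argument shows how the estimated power changes with N for the assumed effect size.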
