  • sample size for multivariate logistic regression

    Dear Experts

    First let me apologise for buzzing you with a layman's question.

    I have a hypothesis that the amount of a certain virus in the blood (i.e. viral_load) is associated with the development of a certain illness.

    We coded the development of illness and sex as binary variables.

    A pilot study with a small sample size (n=59) showed the following:

    Code:
    . sum viral_load sex illness
    
        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
      viral_load |        59     4.30339    1.557403          0        7.3
             sex |        59    .5254237    .5036396          0          1
         illness |        51    .1764706    .3850134          0          1
    
    . sum sex viral_load if illness == 0
    
        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
             sex |        42     .452381    .5037605          0          1
      viral_load |        42    4.204762    1.593577          0        7.3
    
    . sum sex viral_load if illness == 1
    
        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
             sex |         9    .7777778    .4409586          0          1
      viral_load |         9    4.966667     .952628          4        6.9
    
    . logistic illness sex viral_load
    
    Logistic regression                               Number of obs   =         51
                                                      LR chi2(2)      =       5.35
                                                      Prob > chi2     =     0.0690
    Log likelihood = -21.091878                       Pseudo R2       =     0.1125
    
    ------------------------------------------------------------------------------
         illness | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             sex |   4.314489    3.80492     1.66   0.097     .7660555    24.29957
      viral_load |   1.560802    .518568     1.34   0.180     .8138436    2.993332
           _cons |   .0108567   .0200497    -2.45   0.014     .0002909     .405189
    ------------------------------------------------------------------------------
    The P value for viral_load did not reach significance (P=0.18). The result above suggests that we need to account for the effect of sex (coded 1 for male, 0 for female in this dataset).

    How can I estimate the sample size to adequately test the hypothesis in a multivariate analysis which considers the effect of sex?
    Alpha and beta should be left at their default values.

    I am using Stata 13. Since I am not experienced in programming for Stata, I would like to do this with a built-in command or a wizard, if possible.

    Your assistance would be appreciated.

    Yoshi



  • #2
    Stata's -power- command was introduced in version 13, and it has expanded its capabilities since then. However, I believe even now it cannot do sample size calculations for a logistic regression with a covariate.

    A more expansive range of sample size calculations, including what is needed for your situation, is available with G*Power, from the University of Düsseldorf. You can download it at https://www.psychologie.hhu.de/arbei...hologie/gpower. They do require you to register with them, but they do not spam you: you will only hear from them if there is a new release you can install or a bug fix. It has a GUI that is easy to use. It is not really suitable for professional statisticians, who need to do these calculations for more complicated and exotic study designs, but for someone who does only basic analyses like the one you describe, it is excellent.
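
    Since -power- has no method for this design, one option that stays inside Stata is to estimate power by simulation. The following is a minimal sketch, not a validated design calculation. The program name simlogit, the candidate sample size of 200, and the seed are hypothetical. The data-generating values are simply the pilot estimates from post #1 (proportion male, mean and SD of viral_load, and the three odds ratios), which are imprecise with only 9 events. The sketch also assumes that viral_load is roughly normal and independent of sex.

    ```stata
    * Simulation-based power for the viral_load coefficient, adjusting for sex.
    * All numeric inputs are assumptions copied from the pilot output.
    capture program drop simlogit
    program define simlogit, rclass
        version 13
        syntax , n(integer)
        drop _all
        set obs `n'
        * covariates: pilot proportion male, pilot mean and SD of viral_load
        generate byte sex = runiform() < 0.525
        generate double viral_load = rnormal(4.30, 1.56)
        * linear predictor on the log-odds scale, from the pilot odds ratios
        generate double xb = ln(0.0108567) + ln(4.314489)*sex ///
            + ln(1.560802)*viral_load
        generate byte illness = runiform() < invlogit(xb)
        logit illness sex viral_load
        * two-sided Wald p-value for viral_load
        return scalar p = 2*normal(-abs(_b[viral_load]/_se[viral_load]))
    end

    set seed 12345
    simulate p = r(p), reps(1000) nodots: simlogit, n(200)
    generate byte reject = p < 0.05 if !missing(p)
    summarize reject
    ```

    The mean of reject estimates the power at n = 200 under these assumptions. Rerunning with other values in n() until the mean reaches the target power (for example 0.80) gives an approximate sample size.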


    • #3
      Dear Clyde, thank you very much for pointing me to G*Power. I will look at the G*Power site as soon as possible.


      • #4
        I agree that G*Power is very good, but I want to note another free alternative: SampSize, which can be downloaded to a variety of platforms, including tablets and smartphones. For the link and instructions, see Flight, L and Julious, SA (2022), "A practical guide to sample size calculations: installation of the app SampSize", Pharmaceutical Statistics, 21: 1109-1110. The citations in that paper include a number of what are basically tutorials.


        • #5
          Hi Rich, Thank you very much for the useful information.
          Yoshiro


          • #6
            Take a look at powerlog.ado at
            Code:
            net from https://stats.oarc.ucla.edu/stat/stata/ado/analysis/
            .
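
            A possible way to use it with the pilot numbers from post #1 is sketched below. This is an illustration under assumptions, not a confirmed recipe. The package name powerlog and the option names p1(), p2(), rsq() and alpha() are given as I recall them from the UCLA help file, so check -help powerlog- after installing. Here p1 is taken as the probability of illness at the mean of viral_load (approximated by the pilot prevalence), p2 as the probability one standard deviation above the mean, and rsq(0.05) is a placeholder for the squared correlation between viral_load and sex, which the pilot output does not report.

            ```stata
            * Install powerlog from the UCLA site named above
            * (package name assumed to be powerlog).
            net install powerlog, from(https://stats.oarc.ucla.edu/stat/stata/ado/analysis/)

            * Derive p2 from the pilot estimates (assumptions, not facts):
            * odds ratio per SD of viral_load = OR per unit raised to the SD
            display "OR per SD = " 1.560802^1.557403
            * event probability one SD above the mean, starting from p1 = 0.176
            display "p2 = " invlogit(logit(0.176) + 1.557403*ln(1.560802))

            * p2 is rounded to 0.30 here; rsq(0.05) is a placeholder value.
            powerlog, p1(0.176) p2(0.30) rsq(0.05) alpha(0.05)
            ```

            Because the pilot has only 9 events, the inputs are very uncertain, so it may be worth repeating the call with smaller effect sizes (a p2 closer to p1) to see how strongly the required sample size depends on them.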
