sample size for multivariate logistic regression

Yoshiro Nagao

Join Date: Feb 2018
Posts: 24

04 Feb 2023, 17:42

Dear Experts

First let me apologise for buzzing you with a layman's question.

I have a hypothesis that the amount of a certain virus in the blood (i.e. viral_load) is associated with the development of a certain illness.

We coded the development of illness and sex as binary variables.

A pilot study with a small sample size (n=59) showed the following:

Code:

. sum viral_load sex illness

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
  viral_load |        59     4.30339    1.557403          0        7.3
         sex |        59    .5254237    .5036396          0          1
     illness |        51    .1764706    .3850134          0          1

. sum sex viral_load if illness == 0

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         sex |        42     .452381    .5037605          0          1
  viral_load |        42    4.204762    1.593577          0        7.3

. sum sex viral_load if illness == 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         sex |         9    .7777778    .4409586          0          1
  viral_load |         9    4.966667     .952628          4        6.9

. logistic illness sex viral_load

Logistic regression                               Number of obs   =         51
                                                  LR chi2(2)      =       5.35
                                                  Prob > chi2     =     0.0690
Log likelihood = -21.091878                       Pseudo R2       =     0.1125

------------------------------------------------------------------------------
     illness | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sex |   4.314489    3.80492     1.66   0.097     .7660555    24.29957
  viral_load |   1.560802    .518568     1.34   0.180     .8138436    2.993332
       _cons |   .0108567   .0200497    -2.45   0.014     .0002909     .405189
------------------------------------------------------------------------------

The P value for viral_load was marginal (P=0.18). The result above suggests that we have to account for the effect of sex (coded 1 for male, 0 for female in this dataset).

How can I estimate the sample size needed to adequately test this hypothesis in a multivariate analysis that accounts for the effect of sex?
Alpha and beta should be left at their defaults.

I am using Stata 13. Since I am not experienced in programming for Stata, I would like to do the work by a command or a wizard, if possible.

Your assistance would be appreciated.

Yoshi


Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

05 Feb 2023, 11:03

Stata's -power- command was introduced in version 13, and it has expanded its capabilities since then. However, I believe even now it cannot do sample size calculations for a logistic regression with a covariate.

A more expansive range of sample size calculations, including what is needed for your situation, is available in G*Power, from the University of Düsseldorf. You can download it at https://www.psychologie.hhu.de/arbei...hologie/gpower. They do require you to register, but they do not spam you: you will only hear from them when there is a new release or a bug fix to install. It has a GUI that is easy to use. It is not really suitable for professional statisticians, who need to do these calculations for more complicated and exotic study designs, but for someone who does only basic analyses like the one you describe, it is excellent.
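As a rough cross-check on whatever the software reports, one closed-form approximation for a continuous predictor is that of Hsieh, Bloch and Larsen (1998), Statistics in Medicine, 17: 1623-1634. The sketch below plugs the pilot estimates from #1 into that formula and then inflates the result for the other covariate. It rests on several assumptions: the pilot odds ratio and standard deviation are treated as the true values (the pilot confidence interval of roughly 0.81 to 2.99 shows how uncertain that is), the crude illness rate stands in for the event rate at the mean of viral_load, and rho2 is a purely hypothetical placeholder for the squared correlation between viral_load and sex. The scalar names are illustrative only. Run it from a do-file.

```stata
* Approximate sample size for one continuous predictor in logistic
* regression, after Hsieh, Bloch and Larsen (1998), with a variance
* inflation factor for the remaining covariate (sex).

scalar a_level = 0.05        // two-sided alpha (default)
scalar pwr     = 0.80        // power = 1 - beta (default)
scalar or_vl   = 1.560802    // pilot odds ratio per 1-unit viral_load
scalar sd_vl   = 1.557403    // pilot SD of viral_load
scalar p_event = 0.1764706   // crude illness rate (proxy, see above)
scalar rho2    = 0.10        // HYPOTHETICAL placeholder, replace it

* Effect size: log odds ratio per one-SD increase in viral_load
scalar bstar = ln(or_vl) * sd_vl

* Sum of the standard normal quantiles for alpha/2 and power
scalar zsum = invnormal(1 - a_level/2) + invnormal(pwr)

* Sample size with viral_load as the only predictor
scalar n_uni = zsum^2 / (p_event * (1 - p_event) * bstar^2)

* Inflate for the correlation between viral_load and sex
scalar n_adj = n_uni / (1 - rho2)

display "n, viral_load only:  " ceil(n_uni)
display "n, adjusted for sex: " ceil(n_adj)
```

A value for rho2 could be taken from the pilot data, for example by squaring the correlation reported by -correlate viral_load sex-, although with so few observations that estimate will also be imprecise.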
Yoshiro Nagao

Join Date: Feb 2018

Posts: 24
#3

05 Feb 2023, 18:51

Dear Clyde, thank you very much for letting me know about G*Power. I will look at the site as soon as possible.
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#4

06 Feb 2023, 13:24

I agree that G*Power is very good, but I want to note another free alternative: SampSize, which can be downloaded to a variety of platforms, including tablets and smartphones. For the link and instructions, see Flight, L. and Julious, S.A. (2022), "A practical guide to sample size calculations: installation of the app SampSize", Pharmaceutical Statistics, 21: 1109-1110. The citations in that paper include a number of what are basically tutorials.
Yoshiro Nagao

Join Date: Feb 2018

Posts: 24
#5

07 Feb 2023, 22:59

Hi Rich, thank you very much for the useful information.
Yoshiro
Dimitriy V. Masterov

Join Date: Mar 2014

Posts: 609
#6

08 Feb 2023, 17:52

Take a look at powerlog.ado, which is available from:

Code:

net from https://stats.oarc.ucla.edu/stat/stata/ado/analysis/
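If none of the tools mentioned in this thread fits the design, power for the adjusted model can also be estimated by simulation, using only commands available in Stata 13. The following is a minimal sketch, not a definitive implementation. The program name simlogit and the trial sample size of 150 are hypothetical. It assumes that the pilot coefficients from #1 are the true ones, that about 52.5% of subjects are male, and that viral_load is normally distributed (mean 4.3, SD 1.56) and independent of sex. None of these assumptions is confirmed by the pilot data. Run it from a do-file.

```stata
* Simulation-based power for the viral_load coefficient in
* -logit illness sex viral_load-. All data-generating values are
* assumptions taken from the pilot output.

capture program drop simlogit
program define simlogit, rclass
    args n
    drop _all
    set obs `n'
    * Covariates: assumed distributions, assumed independent
    generate byte sex = runiform() < 0.525
    generate viral_load = rnormal(4.3, 1.56)
    * Linear predictor built from the pilot odds ratios
    generate xb = ln(0.0108567) + ln(4.314489)*sex + ln(1.560802)*viral_load
    generate byte illness = runiform() < invlogit(xb)
    quietly logit illness sex viral_load
    * Two-sided Wald P value for viral_load
    return scalar p = 2*normal(-abs(_b[viral_load]/_se[viral_load]))
end

* 1000 simulated studies at a trial sample size of 150
simulate p = r(p), reps(1000) nodots seed(12345): simlogit 150

* Estimated power = share of replications with P < 0.05
generate byte reject = p < 0.05 if !missing(p)
summarize reject
display "Estimated power at n = 150: " r(mean)
```

Rerunning the -simulate- line with different values in place of 150 shows roughly where the estimated power crosses 0.80. Changing the assumed coefficients shows how sensitive that sample size is to the pilot estimates.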