
  • sample size for multivariate logistic regression

    Dear Experts

    First let me apologise for buzzing you with a layman's question.

    I have a hypothesis that the amount of a certain virus in the blood (i.e. viral_load) is associated with the development of a certain illness.

    We coded the development of illness and sex as binary variables.

    A pilot study with a small sample size (n=59) gave the following results:

    Code:
    . sum viral_load sex illness
    
        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
      viral_load |        59     4.30339    1.557403          0        7.3
             sex |        59    .5254237    .5036396          0          1
         illness |        51    .1764706    .3850134          0          1
    
    . sum sex viral_load if illness == 0
    
        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
             sex |        42     .452381    .5037605          0          1
      viral_load |        42    4.204762    1.593577          0        7.3
    
    . sum sex viral_load if illness == 1
    
        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
             sex |         9    .7777778    .4409586          0          1
      viral_load |         9    4.966667     .952628          4        6.9
    
    . logistic illness sex viral_load
    
    Logistic regression                               Number of obs   =         51
                                                      LR chi2(2)      =       5.35
                                                      Prob > chi2     =     0.0690
    Log likelihood = -21.091878                       Pseudo R2       =     0.1125
    
    ------------------------------------------------------------------------------
         illness | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             sex |   4.314489    3.80492     1.66   0.097     .7660555    24.29957
      viral_load |   1.560802    .518568     1.34   0.180     .8138436    2.993332
           _cons |   .0108567   .0200497    -2.45   0.014     .0002909     .405189
    ------------------------------------------------------------------------------
    The P value for viral_load did not reach significance (P=0.18). The result above suggests that we have to account for the effect of sex (coded 1 for male, 0 for female in this dataset).

    How can I estimate the sample size needed to test this hypothesis adequately in a multivariate analysis that accounts for the effect of sex?
    Alpha and beta should be left at their default values.

    I am using Stata 13. Since I am not experienced in programming for Stata, I would prefer to do this with a single command or a wizard, if possible.

    Your assistance would be appreciated.

    Yoshi



  • #2
    Yoshi:
    1) you probably want to run a multiple logistic regression (not a multivariate one, which implies more than one dependent variable);
    2) you should probably interact your predictors (I assume that in your research sex is an n-level categorical variable):
    logistic illness i.sex##c.viral_load
    3) see https://blog.stata.com/2019/08/13/ca...ic-regression/
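    One way to try the simulation route in 3) on Stata 13 is to wrap the data generation and the model fit in a small program and repeat it with simulate. What follows is only a rough sketch under stated assumptions: the program name simlogit and its options are made up for illustration; the coefficients are the logs of the pilot odds ratios (ln 4.31 = 1.46 for sex, ln 1.56 = 0.445 per unit of viral_load, ln 0.0109 = -4.52 for the constant); viral_load is assumed to be roughly normal with the pilot mean and SD; and "default" alpha and beta are taken to mean alpha = 0.05 and power = 0.80. With only 9 events in the pilot these inputs are very imprecise, so it would be prudent to rerun with smaller effects as well.

    ```stata
    * Hypothetical helper: simulate one dataset of size n, fit the model,
    * and return 1 if viral_load is significant at level alpha.
    capture program drop simlogit
    program define simlogit, rclass
        version 13
        syntax , n(integer) [ b0(real -4.52) bsex(real 1.46) ///
                              bvl(real 0.445) alpha(real 0.05) ]
        drop _all
        set obs `n'
        * Assumed covariate distributions, taken from the pilot summaries
        generate byte   sex        = runiform() < 0.525
        generate double viral_load = rnormal(4.3, 1.56)
        * Assumed true model on the log-odds scale
        generate double xb      = `b0' + `bsex'*sex + `bvl'*viral_load
        generate byte   illness = runiform() < invlogit(xb)
        * Fit the model; a replicate that fails to fit is returned as missing
        capture logit illness sex viral_load
        if _rc {
            return scalar reject = .
            exit
        }
        quietly test viral_load
        return scalar reject = (r(p) < `alpha')
    end

    * Illustrative sample sizes only; extend the list as needed
    foreach n of numlist 100 200 300 400 {
        quietly simulate reject = r(reject), reps(1000) seed(12345) nodots: ///
            simlogit, n(`n')
        quietly summarize reject
        display "n = `n'   estimated power = " %5.3f r(mean)
    }
    ```

    The mean of reject is the proportion of simulated datasets in which viral_load is significant after adjusting for sex, i.e. the estimated power; the smallest n at which it reaches about 0.80 is a candidate sample size. Replicates returned as missing are ignored by summarize. If the interaction in 2) is the quantity of interest, the same program could be adapted by fitting i.sex##c.viral_load and testing the interaction term instead.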
    Kind regards,
    Carlo
    (Stata 19.0)
