Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Comparing results of mixed-effects logistic regression with crossed random effects: Stata and R (lme4)

    I have created a mixed-effects logistic model with crossed random effects for my data in both Stata and R (using lme4). The two models, presented below, have yielded very similar estimates for the coefficients, but it appears my method of calculating standard errors and p-values in R differs from Stata's method. Could anyone shed any light on the discrepancy between these two models?

    Stata:

    Code:
    .melogit A B i.C D || _all: R.E || F:
    
    Mixed-effects logistic regression               Number of obs      =       553
    
    -----------------------------------------------------------
                              |   No. of       Observations per Group
     Group Variable |   Groups    Minimum    Average    Maximum
    ----------------+------------------------------------------
                 _all |        1        553      553.0        553
                    F |       23         14       24.0         40
    -----------------------------------------------------------
    
    Integration method:     laplace
    
                                                    Wald chi2(3)       =     29.21
    Log likelihood = -337.64555                     Prob > chi2        =    0.0000
    --------------------------------------------------------------------------------------
                          A |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------------+----------------------------------------------------------------
                          B |   .0521762   .0259335     2.01   0.044     .0013474    .1030049
                       1.C |  -2.563292   .7732082    -3.32   0.001    -4.078753   -1.047832
                          D |  -4.070081   1.131891    -3.60   0.000    -6.288546   -1.851616
                   _cons |   2.693203   1.467959     1.83   0.067    -.1839439    5.570349
    ---------------------+----------------------------------------------------------------
    _all>E                |
               var(_cons)|   .9765646   .3263157                      .5073117    1.879867
    ---------------------+----------------------------------------------------------------
    F                        |
               var(_cons)|   .0615515   .0880975                      .0037233    1.017531
    --------------------------------------------------------------------------------------
    LR test vs. logistic regression:     chi2(2) =    48.95   Prob > chi2 = 0.0000
    
    Note: LR test is conservative and provided only for reference.


    R (lme4):

    Code:
    > my.glmm = glmer(A ~ B + D + C + (1|E) + (1|F), data = my.data, family = 'binomial')
    > my.glmm
    Generalized linear mixed model fit by maximum likelihood (Laplace
      Approximation) [glmerMod]
     Family: binomial  ( logit )
    Formula: A ~ B + D + C + (1 | E) + (1 |  
        F)
       Data: my.data
          AIC       BIC    logLik  deviance  df.resid 
     687.2924  713.1846 -337.6462  675.2924       547 
    Random effects:
     Groups Name        Std.Dev.
     E  (Intercept) 0.9876  
     F   (Intercept) 0.2479  
    Number of obs: 553, groups:  E, 44; F, 23
    Fixed Effects:
           (Intercept)                 B                D                  C  
               2.69337             0.05206            -4.06681            -2.55907  
    > coefs <- data.frame(coef(summary(my.glmm)))
    > coefs$p.z <- 2 * (1 - pnorm(abs(coefs$t.value)))
    Error in abs(coefs$t.value) : 
      non-numeric argument to mathematical function
    > coefs
                       Estimate      Std..Error      z.value     Pr...z..
    (Intercept)       2.69337259 2.16447149  1.244356 0.2133686532
    B                 0.05206151 0.02424857  2.146992 0.0317938861
    D                -4.06681259 2.13085477 -1.908536 0.0563220015
    C                -2.55906567 0.74622222 -3.429361 0.0006050034


    I know that I am using a roundabout way to get z scores and p-values for the coefficients in R, but it would seem that Stata's method must somehow be different. I'm hoping that I've specified the model correctly in both programs. Are there some estimation methods or assumptions that differ between Stata and R for this type of model? Thank you!
Working...
X