Differences between two ways of specifying random effects in melogit

Bei Chang

Join Date: Jul 2015

Posts: 7
#1

01 Jul 2015, 12:32

According to the documentation of melogit there are 2 ways to specify random effects in a mixed effects logistic regression model:
for random coefficients and intercepts
levelvar: [varlist] [, re_options]

for random effects among the values of a factor variable
levelvar: R.varname

I have used the dataset http://www.stata-press.com/data/r13/bangladesh to fit a mixed effects model to estimate the effect of urban (yes/no) on the use of contraception (c_use), allowing the effect to vary by district (i.e., a random slope model) using the following command:
melogit c_use urban || district:urban, noconstant
The default integration method is mean-variance adaptive Gauss-Hermite quadrature (mvaghermite) with 7 integration points. The estimates of the random-effect variance and of the fixed effect of urban that I obtained from this command differ from those I get when fitting the same random slope model in R and SAS with the same integration method. However, when I used the following command, the results from the three software packages were exactly the same:
melogit c_use urban || district:R.urban
I would appreciate any help explaining why these two commands give different results, and why only the second command reproduces the results from R and SAS.
Thank you.
Bei Chang
Scott Baldwin

Join Date: Apr 2014

Posts: 15
#2

01 Jul 2015, 21:30

Without seeing your SAS or R code, I can't know for sure. However, I suspect the difference you are seeing arises because in SAS and R you've treated --urban-- like a factor variable. SAS calls them class variables.

The Stata code:

Code:

melogit c_use urban || district:urban, noconstant

treats --urban-- like a numeric value. In essence, you are fitting a random slope for urban but no intercept.

The Stata code:

Code:

melogit c_use urban || district:R.urban

treats --urban-- like a factor variable. To replicate the R.urban syntax manually, you can do the following:

Code:

tabulate urban, gen(dummy)
melogit c_use urban || district:dummy1 dummy2, noconstant cov(id)

Thus, you are fitting a random effect for each level of --urban-- within --district--. The cov(id) option constrains the random-effect variances to be equal and their covariance to be zero.
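
To make the contrast concrete, here is a minimal Python sketch (not Stata code, and not how melogit works internally) of the random-effects design row that each specification implies for a single observation. The function names are hypothetical, and urban is assumed to be coded 0/1.

```python
# Illustration only: these helpers are hypothetical, not part of Stata.

def z_numeric(urban):
    """|| district: urban, noconstant -> a single column holding the value of urban."""
    return [float(urban)]


def z_factor(urban, levels=(0, 1)):
    """|| district: R.urban -> one indicator column per level of urban."""
    return [1.0 if urban == level else 0.0 for level in levels]


for urban in (0, 1):
    print(f"urban={urban}  numeric: {z_numeric(urban)}  factor: {z_factor(urban)}")
```

With the numeric coding, a rural observation (urban = 0) has a zero in its only column, so the district effect drops out of its linear predictor. With the indicator coding, every observation loads on exactly one district-level effect.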

Best,
Scott
Comment
Bei Chang

Join Date: Jul 2015

Posts: 7
#3

07 Jul 2015, 14:14

Scott:
Thank you so much for your reply and the helpful answers to my questions. I wish I had a way to be notified when someone replies to my question so that I would see it right away. You are correct that SAS and R treated urban as a categorical/class variable, so to duplicate the same random effect in Stata I should use R.urban or the approach you showed. But fitting a random effect for each level of urban within district is not the same as fitting a random urban effect across districts (i.e., letting the effect of urban vary from district to district), correct?
Thank you again for your help.
Bei
Comment
Scott Baldwin

Join Date: Apr 2014

Posts: 15
#4

07 Jul 2015, 20:33

Hi Bei,

Yes, if I understand you correctly, those are two different things. The random effect of the factor, as specified above, is a random intercept for each level of urban. The way you specified it before, it is a random slope for urban, describing district-specific variation in the relationship between urban and the outcome. Because you suppressed the overall district random intercept, the interpretation of that random slope is more challenging, but I think you get the idea.
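
One way to see the difference is to compare the covariance that each specification implies, on the logit scale, between the random parts of two observations from the same district. The sketch below is in Python rather than Stata. The helper names and the variance value 0.5 are made up for illustration, urban is assumed to be coded 0/1, and the random effects are assumed to be independent with a common variance (as with cov(id)).

```python
# Illustration only: hypothetical helpers and a made-up variance, not estimates.

def z_numeric(urban):
    """Design row for || district: urban, noconstant."""
    return [float(urban)]


def z_factor(urban, levels=(0, 1)):
    """Design row for || district: R.urban (one indicator per level)."""
    return [1.0 if urban == level else 0.0 for level in levels]


def random_part_cov(z_a, z_b, sigma2):
    """Cov(z_a'u, z_b'u) for two observations in one district when Var(u) = sigma2 * I."""
    return sigma2 * sum(a * b for a, b in zip(z_a, z_b))


SIGMA2 = 0.5  # made-up value
pairs = {"rural-rural": (0, 0), "urban-urban": (1, 1), "rural-urban": (0, 1)}
for label, (a, b) in pairs.items():
    slope = random_part_cov(z_numeric(a), z_numeric(b), SIGMA2)
    factor = random_part_cov(z_factor(a), z_factor(b), SIGMA2)
    print(f"{label}: numeric slope = {slope}, factor levels = {factor}")
```

Under the slope-only specification, two rural observations in the same district share no random effect at all, whereas under the factor specification rural and urban observations each share a district-level effect of their own.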

Best,
Scott
Comment
Bei Chang

Join Date: Jul 2015

Posts: 7
#5

15 Jul 2015, 10:21

Hi Scott:
Thank you again for your responses, which make perfect sense to me. Now my problem is that the SAS code below, which I used to run the random slope model, does not give me the same results as Stata. If you happen to know SAS, maybe you can tell whether this code fits the same random slope model as the one fitted in Stata with melogit c_use urban || district:urban, noconstant.

proc glimmix data=women.contraception method=quad (qpoints=7);
class use urban;
model use=urban /solution cl or;
random urban/subject=district;
title2 Using PROC GLIMMIX: Random effect is DISTRICT, estimation method is QUAD 7 QPoints;

Thank you so much for your help.
Bei
Comment
Rich Goldstein

Join Date: Mar 2014

Posts: 4496
#6

15 Jul 2015, 14:52

Bei, it would help those of us who do not know SAS at all if you added some comments explaining what this code is doing; then we might be able to translate your English into Stata code that does the same thing.
Comment
Scott Baldwin

Join Date: Apr 2014

Posts: 15
#7

15 Jul 2015, 23:10

The problem is the same as before. In your glimmix code, urban is a class variable (factor variable in Stata notation) and in your melogit code, urban is a numeric variable. Try dropping urban from the class line in glimmix and see if it gives you the same results as melogit.

Scott
Comment
Bei Chang

Join Date: Jul 2015

Posts: 7
#8

17 Jul 2015, 20:46

Hi Scott:
Thank you for your suggestions. I made urban a numeric variable and removed it from the CLASS statement. The results are now exactly the same as those from the Stata code. This is certainly not intuitive, because urban is a categorical variable with two values, yes and no, indicating urban or not. I had to create a corresponding numeric variable with the two values 1 and 0 and use that numeric variable in the MODEL statement. Quite confusing indeed. Thank you for all your help. Now I would like to figure out the code to fit exactly the same model in R. If you happen to know the code and would share it, I would greatly appreciate it.
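
The recoding step described here can be sketched as follows. This is a Python illustration with a hypothetical helper name, not the SAS code that was actually run. The labels "yes" and "no" come from the description above, and treating any other value as an error is an assumption.

```python
def urban_to_numeric(value):
    """Map the categorical urban labels to the 1/0 coding used for the random slope."""
    mapping = {"yes": 1, "no": 0}
    key = str(value).strip().lower()
    if key not in mapping:
        # Assumption: anything other than yes/no is a data error.
        raise ValueError(f"unexpected urban value: {value!r}")
    return mapping[key]


print([urban_to_numeric(v) for v in ["yes", "no", "Yes"]])
```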
Thank you.
Bei
Comment
Bei Chang

Join Date: Jul 2015

Posts: 7
#9

17 Jul 2015, 21:00

Hi Rich:
As I explained in my first post, I am trying to fit a so-called "random slope" model for the effect of urban (a categorical variable with the values "yes" and "no") across districts on a binary outcome (also with the values yes and no). It turns out that I need to treat urban as a numeric variable to fit a random slope model. If I treat urban as a categorical/factor/class variable, then the software (both Stata and SAS) fits a random interaction model (urban interacting with district) rather than a random slope model (the urban effect varying across districts). Do you agree?
Thank you for your reply.
Bei
Comment
Scott Baldwin

Join Date: Apr 2014

Posts: 15
#10

17 Jul 2015, 22:30

In R, make sure urban is numeric (you can check with the is.numeric() function). Then, with lme4, you can specify the random effects as (-1 + urban | district). I've assumed you have good reason for fitting this model (a random slope without a random intercept). Most of the time this is not advisable.

Best,
Scott
Comment
Rich Goldstein

Join Date: Mar 2014

Posts: 4496
#11

18 Jul 2015, 05:12

Hi Bei,
No, you do not need to treat urban as quantitative; please see the help for "fvvarlist".
Comment
Bei Chang

Join Date: Jul 2015

Posts: 7
#12

18 Jul 2015, 19:58

Hi Scott:
You are right about R. I had to create a numeric variable for urban, as I did in SAS, and use it in the random-effects part of the code as you indicated. The results are now the same as those from melogit with || district:urban in Stata. What I am trying to do is compare different software packages fitting the same random-effects model with the same estimation method, to see whether the results are the same. I tried a random slope model and got different results from the three packages. Now I know the reason is that I did not use the right code. Thank you again for your help.
Comment
Bei Chang

Join Date: Jul 2015

Posts: 7
#13

18 Jul 2015, 20:00

Hi Rich:
Thank you for pointing me to the help documentation about how to use factor variables in Stata.
Comment
