  • Multilevel modelling with melogit and meqrlogit

    Dear Statalists,

    I have to do multi-level modelling to analyse data from a cross-sectional survey for my Master's thesis, and I hope you can enlighten me, as this is the first time I have ever worked with multi-level modelling and I am also a beginner at statistical analysis.

    Background on my dataset:
    • I collected health data on 851 children from 668 households, within 8 study sites. The background data on households are exactly the same for children living in the same house (as we interviewed their caretaker); however, separate data on each child's health were collected.
    • The outcome variable is dichotomous: 0 = no disease / 1 = having disease.
    So I chose to do multi-level mixed-effects logistic regression to estimate the odds ratio (OR) for a specific risk factor, adjusted for several potential confounders. However, there are two different commands for it, melogit and meqrlogit, and I understand neither the difference between them nor the circumstances in which melogit or meqrlogit is preferred.

    I would very much appreciate it if you could help me understand:
    • Whether the household should be included as a second level, given that the cluster size is really small (mean = 1.2; max = 5) and there are many clusters (N = 668 households)
    • What is the difference between melogit and meqrlogit, and how do I know which one to use?
    • Which steps should I follow to build a multi-level model?
    Many thanks,
    Chi


  • #2
    Originally posted by Yen Chi Nguyen:
    Dear Statalists,

    I would very much appreciate it if you could help me understand: whether the household should be included as a second level, given that the cluster size is really small (mean = 1.2; max = 5) and there are many clusters (N = 668 households)
    Yes. The number of clusters is quite large, which means that the variance at that level will be adequately sampled. Even though the clusters are small, the ICC might still be very large in this kind of study. I would certainly include a household level in this analysis. If it becomes problematic for computational reasons, rather than eliminating the level from the model, I would sooner reduce the sample by randomly selecting a single child from each household.
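
    With 851 children nested in 668 households nested in 8 sites, that nesting can be written directly in the random-effects part of the command. A minimal sketch (the variable names disease, riskfactor, site, and household are placeholders, not taken from the original post):

    ```stata
    * Three-level mixed-effects logistic model: children (level 1)
    * nested in households (level 2) nested in study sites (level 3).
    * Random-effects equations are listed from highest to lowest level.
    * The -or- option reports odds ratios instead of coefficients.
    melogit disease i.riskfactor || site: || household: , or

    * If the household level proves computationally problematic, the
    * fallback suggested above is to keep one randomly selected child
    * per household and fit a two-level model instead:
    melogit disease i.riskfactor || site: , or
    ```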

    What is the difference between melogit and meqrlogit, and how do I know which one to use?
    There is no difference in the model they estimate. They use different numerical methods to estimate the model parameters. Ideally, both methods will give the same results. The reason both exist is that the likelihood functions of multi-level mixed-effects logistic models are often badly behaved, and maximization becomes difficult. Different computational approaches work better for different types of likelihoods, so Stata provides two. I think StataCorp recommends trying -melogit- first. If convergence difficulties arise and, after you have made any necessary "repairs" to the model (e.g. eliminating any level where the estimated variance is very close to zero), the difficulties persist, then try -meqrlogit-. The latter will often converge when -melogit- fails.

    Another difference between them is that -melogit- is supported by the -svy:- prefix, whereas -meqrlogit- is not. Also, -melogit- allows weights, but -meqrlogit- does not.
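
    The sequence described above can be sketched as follows (the model is illustrative; disease, riskfactor, and household are placeholder names):

    ```stata
    * Step 1: try -melogit- first (it also supports svy: and weights)
    melogit disease i.riskfactor || household:

    * Step 2: if convergence is difficult, try increasing the number of
    * quadrature points before abandoning -melogit-; the default is 7
    melogit disease i.riskfactor || household: , intpoints(15)

    * Step 3: if difficulties persist after any model "repairs",
    * fall back to -meqrlogit-, which often converges when -melogit- fails
    meqrlogit disease i.riskfactor || household:
    ```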

    Which steps should I follow to build a multi-level model?
    This is a very broad question that is probably not suitable for a forum of this type. As you are doing this for a Master's thesis, ideally your thesis advisor would either point the way or introduce you to a statistician who could help. I realize that in reality many master's degree programs do not provide all the support they should. But, at the end of the day, this is what you are paying tuition for, and you should insist on receiving that level of support. As an adjunct, I highly recommend the online multi-level modelling course offered by Bristol University: http://www.bristol.ac.uk/cmm/learning/online-course/. The content is tailored to your level of knowledge coming in, and the course is offered in both Stata and R versions.




    • #3
      Thank you so much, Clyde Schechter! You saved my day!

      There is no difference in the model they estimate. They use different numerical methods to estimate the model parameters. Ideally, both methods will give the same results.
      When I ran melogit as below, it took more iterations than meqrlogit, and the results from the two commands are different. Could you please advise me which one I should use?

      melogit c_dia i.facility || ID:

      Integration method: ghermite                  Integration pts. =        7
                                                    Wald chi2(3)     =     6.17
      Log likelihood = -313.38716                   Prob > chi2      =   0.1037
      ------------------------------------------------------------------------
               c_dia |     Coef.  Std. Err.      z   P>|z|  [95% Conf. Interval]
      ---------------+--------------------------------------------------------
      toi_used_group |
                   1 | -4.036806   2.802534   -1.44  0.150  -9.529672    1.45606
                   2 | -2.092711   .9337816   -2.24  0.025  -3.922889  -.2625326
                   3 | -.6962276   2.603322   -0.27  0.789  -5.798645    4.40619
               _cons | -37.64706    6.54418   -5.75  0.000  -50.47341   -24.8207
      ---------------+--------------------------------------------------------
      ID             |
          var(_cons) |  2177.092   767.5901                  1090.847   4345.001
      ------------------------------------------------------------------------
      LR test vs. logistic model: chibar2(01) = 53.16  Prob >= chibar2 = 0.0000
      Note: The above coefficient values are the result of non-adaptive
      quadrature because the adaptive parameters could not be computed.

      meqrlogit c_dia i.facility || ID:

               c_dia |     Coef.  Std. Err.      z   P>|z|  [95% Conf. Interval]
      ---------------+--------------------------------------------------------
      toi_used_group |
                   1 | -.6039362   .4662919   -1.30  0.195  -1.517852   .3099791
                   2 | -1.941517   .5759072   -3.37  0.001  -3.070274  -.8127594
                   3 | -.6603995   1.400095   -0.47  0.637  -3.404536   2.083737
               _cons | -2.215967   .5238236   -4.23  0.000  -3.242642  -1.189292
      ------------------------------------------------------------------------
      Random-effects Parameters |  Estimate  Std. Err.  [95% Conf. Interval]
      --------------------------+---------------------------------------------
      ID: Identity              |
                     var(_cons) |  6.212588   3.507884   2.054229   18.78868
      ------------------------------------------------------------------------
      LR test vs. logistic model: chibar2(01) = 11.61  Prob >= chibar2 = 0.0003

      If convergence difficulties arise, and after first making any necessary "repairs" to the model (e.g. eliminating any level where the estimated variance is very close to zero) convergence difficulties persist, then try -meqrlogit-.
      When you said "convergence difficulties arise", does that mean the model keeps saying "not concave" while showing hundreds of iterations, like this?

      Iteration 320: log likelihood = -311.82476 (not concave)

      And could you please advise me on how to know if the estimated variance is close to zero?



      • #4
        My first thought is that one of them converged and the other did not. It is OK to see outputs like "Iteration 320: log likelihood = -311.82476 (not concave)" along the way. But if the final iteration says that, then your estimation did not converge. (The same applies to "backed up".)

        If one of these converged and the other did not, then I would trust the one that converged and discard the other. If neither converged, then neither set of results is valid.
        If both converged, my second thought would be that they were not run on the same data.
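
        One way to rule out the "different data" explanation is to force both commands onto exactly the same estimation sample, and to check the stored convergence flag rather than eyeballing the iteration log. A sketch, using the variable names from your output:

        ```stata
        * Fit once and mark the estimation sample (e(sample) flags the
        * observations actually used, excluding any with missing values)
        melogit c_dia i.facility || ID:
        generate byte insample = e(sample)

        * Refit both models on exactly those observations
        melogit   c_dia i.facility || ID: if insample
        display e(converged)   // 1 if the optimizer converged, 0 if not

        meqrlogit c_dia i.facility || ID: if insample
        display e(converged)
        ```

        If both report e(converged) = 1 on the identical sample and still disagree this sharply, something else is wrong with the model specification.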

        And could you please advise me on how to know if the estimated variance is close to zero?
        You just look at them! Certainly in the outputs you are showing, the variance components are nowhere near zero.

        In the future, please post Stata output using code delimiters, not HTML tables. What you have posted is poorly aligned and difficult to read. Using code delimiters produces a neatly aligned result in a fixed-width font. If you do not know how to use code delimiters, read FAQ #12 for instructions.

