  • Multilevel modelling with melogit and meqrlogit

    Dear Statalists,

    I have to do multi-level modelling to analyse data from a cross-sectional survey for my Master's thesis, and I hope you can enlighten me, as this is the first time I have ever worked with multi-level modelling and I am also a beginner at statistical analysis.

    Background on my dataset:
    • I collected health data on 851 children from 668 households, within 8 study sites. The background data on households are exactly the same for children living in the same house (as we interviewed their caretaker); however, separate data on each child's health were collected.
    • The outcome variable is dichotomous: 0 = no disease / 1 = having disease.
    So I chose to do multi-level mixed-effects logistic regression to estimate the odds ratio (OR) for a specific risk factor, adjusted for several potential confounders. However, there are two different commands for it, melogit and meqrlogit, and I understand neither the difference between them nor the circumstances in which melogit or meqrlogit is preferred.

    I would very much appreciate it if you could help me understand:
    • Whether the household should be included as a second level, given that the cluster size is really small (mean = 1.2; max = 5) and there are many clusters (N = 668 households)
    • What is the difference between melogit and meqrlogit, and how do I know which one to use?
    • Which steps should I follow to build a multi-level model?
    Many thanks,
    Chi


  • #2
    Originally posted by Yen Chi Nguyen:
    Dear Statalists,

    I would very much appreciate it if you could help me understand: whether the household should be included as a second level, given that the cluster size is really small (mean = 1.2; max = 5) and there are many clusters (N = 668 households)
    Yes. The number of clusters is quite large, which means that the variance at that level will be adequately sampled. Even though the clusters are small, the ICC might still be very large in this kind of study. I would certainly include a household level in this analysis. If it becomes problematic for computational reasons, rather than eliminating the level from the model, I would sooner reduce the sample by randomly selecting a single child from each household.
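
    With 851 children nested in 668 households nested in 8 sites, that nesting can be written directly in the random-effects part of the command. A minimal sketch (the variable names disease, riskfactor, site, and household are placeholders, not taken from the original post):

    ```stata
    * Three-level mixed-effects logistic model: children (level 1)
    * nested in households (level 2) nested in study sites (level 3).
    * Random-effects equations are listed from highest to lowest level.
    * The -or- option reports odds ratios instead of coefficients.
    melogit disease i.riskfactor || site: || household: , or

    * If the household level proves computationally problematic, the
    * fallback suggested above is to keep one randomly selected child
    * per household and fit a two-level model instead:
    melogit disease i.riskfactor || site: , or
    ```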

    What is the difference between melogit and meqrlogit, and how do I know which one to use?
    There is no difference in the model they estimate. They use different numerical methods to estimate the model parameters. Ideally, both methods will give the same results. The reason both exist is that the likelihood functions of multi-level mixed-effects logistic models are often badly behaved, and maximization becomes difficult. Different computational approaches work better for different types of likelihoods, so Stata provides two. I think StataCorp recommends trying -melogit- first. If convergence difficulties arise and, after you have made any necessary "repairs" to the model (e.g. eliminating any level where the estimated variance is very close to zero), the difficulties persist, then try -meqrlogit-. The latter will often converge when -melogit- fails.

    Another difference between them is that -melogit- is supported by the -svy:- prefix, whereas -meqrlogit- is not. Also, -melogit- allows weights, but -meqrlogit- does not.
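
    The sequence described above can be sketched as follows (the model is illustrative; disease, riskfactor, and household are placeholder names):

    ```stata
    * Step 1: try -melogit- first (it also supports svy: and weights)
    melogit disease i.riskfactor || household:

    * Step 2: if convergence is difficult, try increasing the number of
    * quadrature points before abandoning -melogit-; the default is 7
    melogit disease i.riskfactor || household: , intpoints(15)

    * Step 3: if difficulties persist after any model "repairs",
    * fall back to -meqrlogit-, which often converges when -melogit- fails
    meqrlogit disease i.riskfactor || household:
    ```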

    Which steps should I follow to build a multi-level model?
    This is a very broad question that is probably not suitable for a forum of this type. As you are doing this for a Master's thesis, ideally your thesis advisor would either point the way or introduce you to a statistician who could help. I realize that in reality many master's degree programs do not provide all the support they should. But, at the end of the day, this is what you are paying tuition for, and you should insist on receiving that level of support. As an adjunct, I highly recommend the online multi-level modelling course offered by Bristol University: http://www.bristol.ac.uk/cmm/learning/online-course/. The content is tailored to your level of knowledge coming in, and the course is offered in both Stata and R versions.




    • #3
      Thank you so much, Clyde Schechter! You saved my day!

      There is no difference in the model they estimate. They use different numerical methods to estimate the model parameters. Ideally, both methods will give the same results.
      When I ran melogit as below, it took more iterations than meqrlogit, and the results from the two commands are different. Could you please advise me which one I should use?

      melogit c_dia i.facility || ID:

      Integration method: ghermite                  Integration pts. =        7
                                                    Wald chi2(3)     =     6.17
      Log likelihood = -313.38716                   Prob > chi2      =   0.1037
      ------------------------------------------------------------------------
               c_dia |     Coef.  Std. Err.      z   P>|z|  [95% Conf. Interval]
      ---------------+--------------------------------------------------------
      toi_used_group |
                   1 | -4.036806   2.802534   -1.44  0.150  -9.529672    1.45606
                   2 | -2.092711   .9337816   -2.24  0.025  -3.922889  -.2625326
                   3 | -.6962276   2.603322   -0.27  0.789  -5.798645    4.40619
               _cons | -37.64706    6.54418   -5.75  0.000  -50.47341   -24.8207
      ---------------+--------------------------------------------------------
      ID             |
          var(_cons) |  2177.092   767.5901                  1090.847   4345.001
      ------------------------------------------------------------------------
      LR test vs. logistic model: chibar2(01) = 53.16  Prob >= chibar2 = 0.0000
      Note: The above coefficient values are the result of non-adaptive
      quadrature because the adaptive parameters could not be computed.

      meqrlogit c_dia i.facility || ID:

               c_dia |     Coef.  Std. Err.      z   P>|z|  [95% Conf. Interval]
      ---------------+--------------------------------------------------------
      toi_used_group |
                   1 | -.6039362   .4662919   -1.30  0.195  -1.517852   .3099791
                   2 | -1.941517   .5759072   -3.37  0.001  -3.070274  -.8127594
                   3 | -.6603995   1.400095   -0.47  0.637  -3.404536   2.083737
               _cons | -2.215967   .5238236   -4.23  0.000  -3.242642  -1.189292
      ------------------------------------------------------------------------
      Random-effects Parameters |  Estimate  Std. Err.  [95% Conf. Interval]
      --------------------------+---------------------------------------------
      ID: Identity              |
                     var(_cons) |  6.212588   3.507884   2.054229   18.78868
      ------------------------------------------------------------------------
      LR test vs. logistic model: chibar2(01) = 11.61  Prob >= chibar2 = 0.0003

      If convergence difficulties arise, and after first making any necessary "repairs" to the model (e.g. eliminating any level where the estimated variance is very close to zero) convergence difficulties persist, then try -meqrlogit-.
      When you said "convergence difficulties arise", does that mean the model keeps saying "not concave" while showing hundreds of iterations, like this?

      Iteration 320: log likelihood = -311.82476 (not concave)

      And could you please advise me on how to know if the estimated variance is close to zero?



      • #4
        My first thought is that one of them converged and the other did not. It is OK to see outputs like "Iteration 320: log likelihood = -311.82476 (not concave)" along the way. But if the final iteration says that, then your estimation did not converge. (The same applies to "backed up".)

        If one of these converged and the other did not, then I would trust the one that converged and discard the other. If neither converged, then neither set of results is valid.
        If both converged, my second thought would be that they were not run on the same data.
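
        One way to rule out the "different data" explanation is to force both commands onto exactly the same estimation sample, and to check the stored convergence flag rather than eyeballing the iteration log. A sketch, using the variable names from your output:

        ```stata
        * Fit once and mark the estimation sample (e(sample) flags the
        * observations actually used, excluding any with missing values)
        melogit c_dia i.facility || ID:
        generate byte insample = e(sample)

        * Refit both models on exactly those observations
        melogit   c_dia i.facility || ID: if insample
        display e(converged)   // 1 if the optimizer converged, 0 if not

        meqrlogit c_dia i.facility || ID: if insample
        display e(converged)
        ```

        If both report e(converged) = 1 on the identical sample and still disagree this sharply, something else is wrong with the model specification.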

        And could you please advise me on how to know if the estimated variance is close to zero?
        You just look at them! Certainly in the outputs you are showing, the variance components are nowhere near zero.

        In the future, please post Stata output using code delimiters, not HTML tables. What you have posted is poorly aligned and difficult to read. Using code delimiters produces a neatly aligned result in a fixed-width font. If you do not know how to use code delimiters, read FAQ #12 for instructions.

