
  • Need help Mixed effect model for my PhD project

    Hello everyone,

    I would really appreciate it if you could suggest how to specify correct mixed-effects models for the following research objectives:
    1. To examine the extent to which the effect of immigration status (three categories: native-born, recent immigrants, and long-residing immigrants) on mental health (the variable MH, the Positive Mental Health scale) varies by region of residence (5 categories), regional economic condition (regional GDP, a discrete variable), and regional composition of minority people (the variable VisMin, the total number of visible minorities by region, also a discrete variable).
    I am planning to specify the model like this:

    mixed MH c.GDP##ib1.immigrant c.VisMin##ib1.immigrant [pw=WTS_M] || Province: ib4.Province#ib2.immigrant, covariance(unstructured) mle

    2. To examine the extent to which the effect of post-migration stress (a scale variable) on mental health (the variable MH, the Positive Mental Health scale) varies by region of residence (5 categories), regional economic condition (regional GDP, a discrete variable), and regional composition of minority people (the variable VisMin, the total number of visible minorities by region, also a discrete variable).
    I am planning to specify the model like this:

    mixed MH c.GDP##c.post-stress c.VisMin##c.post-stress [pw=WTS_M] || Province: ib4.Province#c.post-stress, covariance(unstructured) mle

    Could you please comment on my model plan?

    Attention: Clyde Schechter, Carlo Lazzaro, please.

  • #2
    I am not able to comment on whether these models make sense from an economics/demographics perspective. That is well outside my subject matter knowledge domain. If Carlo Lazzaro or one of the other economists who frequently post here responds, he may well be able to shed light on this aspect of things.

    I can tell you that both models are ill-formed in terms of the Stata coding and are unlikely to represent valid models of anything. The problem is that when you specify a random slope, the same slope should also appear in the fixed-effects portion of the model. So, the two regression commands you show should instead be:

    Code:
    mixed MH c.GDP##ib1.immigrant c.VisMin##ib1.immigrant ib4.Province#ib2.immigrant [pw=WTS_M] || Province: ib4.Province#ib2.immigrant, covariance(unstructured) mle

    mixed MH c.GDP##c.post-stress c.VisMin##c.post-stress ib4.Province#c.post-stress [pw=WTS_M] || Province: ib4.Province#c.post-stress, covariance(unstructured) mle
    Failure to include those terms will not prevent Stata from executing the command, and in general it will not interfere with convergence of the estimation. But when those terms are omitted, it is equivalent to imposing a constraint that the mean slopes of the ib4.Province#ib2.immigrant terms (and the corresponding ones in the second command) are zero. That constraint is unlikely to be valid as a structural constraint in many situations, and if it fits the data well in your case, that is likely a coincidence.
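
    To make the zero-mean-slope point concrete, here is a minimal sketch on one of Stata's shipped example datasets rather than on the data in this thread. The dataset (pig) and its variables (weight, week, id) are stand-ins chosen purely for illustration, and the stored-estimate names are arbitrary.

    ```stata
    * Illustration only: pig growth data, not the thread's variables
    webuse pig, clear

    * Random slope on week, with week also in the fixed part:
    * the fixed coefficient on week estimates the mean slope across pigs
    mixed weight week || id: week, covariance(unstructured)
    estimates store with_fixed

    * Same random part, but week omitted from the fixed part:
    * this constrains the mean of the pig-specific slopes to zero
    mixed weight || id: week, covariance(unstructured)
    estimates store no_fixed

    * Compare log likelihoods and information criteria of the two fits
    estimates stats with_fixed no_fixed
    ```

    If both models fit, comparing the two sets of results is one way to see what the implicit constraint costs; how large the difference is will depend entirely on the data.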

    I also wonder why you are using the # operator for the random slope of this interaction when you use the ## operator for the other interactions. There is nothing wrong with this; they are two different ways to parameterize the same thing. But it may make it confusing to interpret your results since the outputs from ## and # look very much alike, but have different meanings. So you will have to exercise care in working with the outputs.
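
    As a rough illustration of how the two operators expand, the sketch below uses fvexpand on the shipped nlsw88 data. The variables race (3 levels) and collgrad (2 levels) are stand-ins for illustration only and have nothing to do with the thread's data.

    ```stata
    * Illustration only: nlsw88 variables stand in for the thread's variables
    sysuse nlsw88, clear

    * a##b is shorthand for a b a#b: main effects plus the interaction
    fvexpand i.race##i.collgrad
    display "`r(varlist)'"

    * a#b alone lists only the cell indicators, with no separate main effects
    fvexpand i.race#i.collgrad
    display "`r(varlist)'"
    ```

    Comparing the two expanded lists shows which terms each operator puts into the model, which is the source of the difference in how the coefficients must be read.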

    As a minor detail, I note that, unless you are using an ancient version of Stata, it is unnecessary to specify the -mle- option, as maximum likelihood has been the default estimation method for -mixed- for many years now. It is at least as old as the name -mixed- itself, if I recall. The original command for this kind of model was called -xtmixed- and used -reml- as the default estimation. But that was a very long time ago. Then again, any version that old would not accept factor-variable notation in the random effects anyway, so, necessarily you are not running so archaic a version. Of course, no harm is done by specifying -mle- in any case.




    • #3
      Clyde Schechter,
      Thank you so much. I completely understand your explanation and suggestion. My only confusion is how to interpret the results when ib4.Province#ib2.immigrant is used on both the fixed-effect and random-effect sides. If I am not wrong, by including it on the fixed-effect side, I can argue that the effect of immigration status on mental health may vary across province of residence. So my question is: do I need to include it on the random-effect side as well? If you don't mind, please help me with a few more suggestions on the following queries:

      1. In addition to using the variable Province on the random-effect side as a random intercept, is it statistically correct to use the same variable on the fixed-effect side, on the random-effect side as a random slope, or both? If there is no issue, could you please suggest two or three articles or book chapters?

      2. Do you think the following modifications of the model plan make sense and are statistically correct?

      mixed MH c.GDP##ib1.immigrant c.VisMin##ib1.immigrant ib4.Province##ib2.immigrant [pw=WTS_M] || Province: , mle

      mixed MH c.GDP##c.post-stress c.VisMin##c.post-stress ib4.Province##c.post-stress [pw=WTS_M] || Province: , mle

      I really appreciate your support.



      • #4
        My only confusion here is how to interpret the result by using ib4.Province#ib2.immigrant in both the fixed effect and random effect sides. If I am not wrong, by using them in the fixed effect side, I can argue that the effect of immigration status on the mental health may vary across province of residence. So, my confusion is, do I need to use it in the random effect side?
        You are correct, and I apologize for not paying careful enough attention in my response. This definitely should only be in the fixed effects. In fact, on deeper reflection, I think that if you include it in the random effects you will end up with an unidentified model that will fail to converge.

        In addition to using the variable province in the random effect side as a random intercept, is it statistically correct if we use the same variable either in the fixed effect side or in the random effect side as a random slope or both?
        Any variable for which you specify a random slope at the province level should also appear in the fixed effects level. Failure to do that is equivalent to constraining the mean of the random slopes to zero. However, the variable that defines a level, province in your case, and gets a random intercept, must not also appear in the fixed effects level. (This is a good reason for using # instead of ## with interaction terms that involve province.)

        Do you think the following modifications of the model plan will make sense and be statistically correct?

        mixed MH c.GDP##ib1.immigrant c.VisMin##ib1.immigrant ib4.Province##ib2.immigrant [pw=WTS_M] || Province: , mle

        mixed MH c.GDP##c.post-stress c.VisMin##c.post-stress ib4.Province##c.post-stress [pw=WTS_M] || Province: , mle
        Change the ib4.Province##... interaction terms to ib4.Province#... so that you don't get a fixed Province effect on its own. You will now, as noted earlier, have a mixture of ## and # interactions in your model, so you will need to be careful when you interpret them.

        Specifically, and you probably know this already, with X1#X2 categorical interactions, X1 and X2 having n1 and n2 levels, respectively, you will get n1*n2 - 1 interaction terms: one cell is omitted as the base category, and each remaining term is interpreted as the difference between the expected outcome at the corresponding values of X1 and X2 and the expected outcome at the base-category pair of X1 and X2 levels. By contrast, with X1##X2, you get n1-1 X1 terms, n2-1 X2 terms, and (n1-1)*(n2-1) interaction terms. Here the X1 terms represent differences in expected outcome between the corresponding level of X1 and its base level, conditional on X2 being at its base level, and the X2 terms represent the analogous differences for X2, conditional on X1 being at its base level. The interaction terms in the output then represent increments to be added to the sum of the corresponding X1 and X2 coefficients to obtain the difference in expected outcome between that combination of X1 and X2 and the base combination. (Rather than doing those calculations, it is better to use -margins-.)

        When, as in your model, one of the interacted variables is continuous, it is a little bit simpler. But, again, your best bet is to use -margins- to calculate expected outcomes and marginal effects.
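
        The two parameterizations can be compared on a toy example. The sketch below uses the shipped nlsw88 data with an ordinary regression instead of -mixed-; the variables wage, race (3 levels) and collgrad (2 levels) are stand-ins for illustration only and are not the thread's variables. With n1 = 3 and n2 = 2, the counts given above imply five coefficients besides the constant under either specification.

        ```stata
        * Illustration only: nlsw88 variables stand in for the thread's variables
        sysuse nlsw88, clear

        * Factorial form: main effects plus interaction increments
        regress wage i.race##i.collgrad
        * Expected wage in each race-by-collgrad cell
        margins race#collgrad
        * Difference associated with collgrad within each level of race
        margins race, dydx(collgrad)

        * Cell-indicator form: each cell compared with the base cell
        regress wage i.race#i.collgrad
        * Expected wage in each cell again, for comparison with the first fit
        margins race#collgrad
        ```

        Provided every cell is observed, both specifications are saturated in the two factors, so the two -margins race#collgrad- tables should agree even though the regression coefficients look different. That is why working from -margins- output is less error-prone than adding up coefficients by hand.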


