Mixed multilevel models with crossed random effects

Hend She

Join Date: Jul 2020

Posts: 70
#1

21 Sep 2022, 04:08

Dear Statalist members,

I have a question regarding the correctness of the specification of the current multilevel models.
The outcome variable is migration concerns. The main explanatory variable that we would like to inspect is the welfare dependency rate "WDR," which is defined as household transfer share*100, where household transfer share is the share of household income from public transfers/ household post-gov't income. Each household is counted once by considering household heads, who carry the WDR of their respective households. WDR therefore takes values between 0 and 100.

Since I am applying multilevel models for the first time in practice, I have a couple of questions:
1. Is this the right way of nesting? (in particular w.r.t _all and R.)

Code:

eststo: mixed concerns ///
    std_WDR_n_nuts3_syear ///
    std_WDR_m_nuts3_syear ///
    || _all:R.nuts3 || syear: if head==0 & sample==1

The above covariates were defined to capture WDR on a regional level per year
where nuts3: districts, syear: survey year, n: dummy for natives, m: dummy for migrants, std refers to standardized (as we standardized all continuous variables in the regression)

Code:

sort nuts3 syear
by nuts3 syear: egen WDR_n_nuts3_syear = mean(WDR_n)
by nuts3 syear: egen WDR_m_nuts3_syear = mean(WDR_m)

2. Is it right to combine the regionally disaggregated WDR for migrants and natives respectively per year as defined above along with the aggregate WDR (without any region distinction) in one model?

Code:

eststo: mixed concerns ///
    std_WDR_n_nuts3_syear ///
    c.std_WDR_m_nuts3_syear ///
    i.migback##c.std_kr_foreigner_nuts3 ///
    i.region ///
    std_age ///
    i.mar ///
    i.male ///
    std_childrennum ///
    i.emplstat ///
    i.edu ///
    std_WDR ///
    || _all:R.nuts3 || syear: if head==0 & sample==1

Thank you in advance!
Tags: crossed random effects, mixed, multilevel, regression
Clyde Schechter

Join Date: Apr 2014

Posts: 30165
#2

21 Sep 2022, 09:58

1. Is this the right way of nesting? (in particular w.r.t _all and R.)

It is syntactically correct as an expression of crossed random effects between region (nuts3) and year. However, are you sure you want to model survey year as a random effect? It is unusual to do that, and would be most appropriate if the years were randomly sampled from a universe of years. But that is probably not the situation you are dealing with. More likely, there is a relatively small number of years in which the survey was administered, and those were selected deterministically at more or less regular intervals. As such, it would be more usual to simply include i.syear as a covariate in the bottom level. Bear in mind, also, that if the number of years is small (as is the case for most surveys) your variance estimate, based on a small sample of year-space, is likely to be very imprecise and unlikely to be useful for practical purposes.

2. Is it right to combine the regionally disaggregated WDR for migrants and natives respectively per year as defined above along with the aggregate WDR (without any region distinction) in one model?

No.
Hend She

Join Date: Jul 2020

Posts: 70
#3

21 Sep 2022, 11:04

Dear Clyde, thank you very much for your very helpful response! Noted. In this case, we are dealing with five survey years. Could you please type your suggested way of coding w.r.t. question 1?
Clyde Schechter

Join Date: Apr 2014
Posts: 30165
#4

21 Sep 2022, 11:12

Code:

eststo: mixed             concerns                ///
                std_WDR_n_nuts3_syear            ///
                c.std_WDR_m_nuts3_syear            ///
                i.migback##c.std_kr_foreigner_nuts3    ///
                i.region                 ///    
                std_age                    ///    
                i.mar                     ///
                i.male                     ///    
                std_childrennum                 ///
                i.emplstat                ///    
                i.edu                     ///
                std_WDR                 ///
                i.syear                 ///
                if head==0 & sample==1  ///
        || nuts3:

By the way, if you were to do the crossed-effects model, you would still need to move the -if- condition to the fixed-effects level of the model. (I didn't notice that when I responded in #2 saying that the syntax you showed is OK.)
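For illustration only, a minimal sketch of what the crossed-effects version could look like with the -if- condition moved into the fixed-effects equation. It uses the shortened covariate list and the variable names from the first model in #1, and it is not a recommendation to prefer this model over the one above:

```stata
* Sketch only: crossed random intercepts for district (nuts3) and survey year.
* The if qualifier sits in the fixed-effects equation, before the first ||.
* Variable names are taken from #1; the covariate list is shortened.
mixed concerns                      ///
    std_WDR_n_nuts3_syear           ///
    std_WDR_m_nuts3_syear           ///
    if head==0 & sample==1          ///
    || _all: R.nuts3                ///
    || syear:
```

With only a handful of survey years, the variance component for syear from such a model would still be estimated very imprecisely.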

Hend She

Join Date: Jul 2020

Posts: 70
#5

26 Sep 2022, 07:50

Thank you so much! I guess I also need to adjust the outcome variable here to be standardized as well

Code:

egen std_concerns=std(concerns)

and include the std_concerns in the models rather than just concerns as the outcome variable.
Clyde Schechter

Join Date: Apr 2014

Posts: 30165
#6

26 Sep 2022, 09:58

In my earlier responses I ignored the issue of standardization. But, since you brought it up, here's my advice about it, for what it's worth: standardization is, in most circumstances, a bad idea. If the variable in question has a natural, or widely understood scale and unit of measurement, all that standardization does is obfuscate the results.* Standardization may be useful when the variable has only an artificial scale (such as number of items endorsed on a questionnaire) that is not widely understood. If I understood

"WDR," which is defined as household transfer share*100, where household transfer share is the share of household income from public transfers/ household post-gov't income.

correctly, it sounds like WDR is some sort of ratio of public transfer income to non-government income (I don't know what post means in this context). If so, every reasonably numerate person will understand this variable when used in its natural metric without standardization. By contrast, when you standardize it, nobody but you will understand it, because only you know the standard deviation and mean used for the standardization. Even if you show the mean and standard deviation in your results presentation, you are forcing the audience to do unnecessary calculations to decrypt your findings. Unless you have reason to hate your audience, you should not punish them that way. An even clearer case, I think, is the variable childrennum. If that variable is what its name suggests to me, the number of children the respondent has, then it is frankly ridiculous to standardize it, and the only effect of doing so is to make your findings opaque.
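To make the "unnecessary calculations" concrete, here is a sketch of the back-conversion a reader would face for a predictor x that enters the model linearly. The symbols \bar{x} and s_x (the sample mean and standard deviation used in the standardization) and \beta_x, \beta_z (the coefficients on the raw and standardized versions) are introduced here for illustration only:

```latex
z = \frac{x - \bar{x}}{s_x}
\quad\Longrightarrow\quad
\beta_z = s_x \, \beta_x ,
\qquad
\beta_x = \frac{\beta_z}{s_x}
```

So a coefficient reported per standard deviation of x has to be divided by s_x before it can be read per natural unit of x, which is only possible if s_x is reported alongside it.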

As for the concerns variable, you do not say how it is measured. I can imagine that it is some score on a survey scale. If the survey scale is one that is widely used and understood by people in your field, then, again, standardization serves only to make it harder to follow your findings. But if it is an idiosyncratic scale, or one known only in a small niche of specialists, then standardization is actually helpful.

Always choose the methods that maximize the comprehensibility of your findings to your intended audience. Avoid "mathiness."

*Some people standardize explanatory/predictor variables because they say that this makes it possible to compare their effects and determine which is more important as a determinant of outcome. I will spare you my long diatribe about that, and just give you my bottom line: that is simply an illusion.

Last edited by Clyde Schechter; 26 Sep 2022, 10:01.
Hend She

Join Date: Jul 2020

Posts: 70
#7

15 Jul 2023, 12:38

Thanks a lot for your suggestions! The concerns variable is measured on a three-level survey scale, recoded to (-1: 'not concerned', 0: 'somewhat concerned', 1: 'very concerned'). And household post-gov't income is the household net income including government transfers. I would like to verify what you would call the model in this case (your syntax suggestion in #4), particularly at which level which will also reflect on the econometric model notations. Also, I would like to check how you would generate the regional average WDR once for n and once for m, denoted above by WDR_n_nuts3_syear & WDR_m_nuts3_syear (i.e., the syntax for generating these regional averages given this multilevel setting and i.syear as fixed-effects dummy variables) to make sure my way is correct.

Last edited by Hend She; 15 Jul 2023, 13:05.
Clyde Schechter

Join Date: Apr 2014

Posts: 30165
#8

15 Jul 2023, 13:17

I would like to verify what you would call the model in this case (your syntax suggestion in #4), particularly at which level which will also reflect on the econometric model notations.

Sorry, but I don't understand what you are asking here. Can you clarify?

would like to check how you would generate the regional average WDR once for n and once for m, denoted above by WDR_n_nuts3_syear & WDR_m_nuts3_syear (i.e., the syntax for generating these regional averages given this multilevel setting and i.syear as fixed-effects dummy variables) to make sure my way is correct.

Yes, the code for generating these regional averages shown in #1 looks correct.
Hend She

Join Date: Jul 2020

Posts: 70
#9

15 Jul 2023, 14:16

Thank you so much, Clyde! In your suggestion #4, this would be a two-level model in which survey years are treated as fixed-effect dummy variables, and that is the approach we adopted: a hierarchical structure with individuals nested within districts (NUTS3 regions). Is that correct? And the econometric notation for the outcome should be \text{Migration Worries}_{ij}, with the syear dummies entering the same equation as \sum_{k=2}^{5} \delta_{k} \text{syear}_{k}?
Clyde Schechter

Join Date: Apr 2014

Posts: 30165
#10

15 Jul 2023, 14:24

You have not until now mentioned any variable called Migration_worries. If this is an alias for the outcome variable concerns, then, yes the notation for it would be as you describe.

And the notation for the syear indicators is as you describe.
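Putting the pieces from #9 together, the two-level random-intercept model from #4 could be written roughly as follows. This is a sketch under assumptions: i indexes individuals and j indexes NUTS3 districts, \mathbf{x}_{ij} stands in for all the other covariates, and the symbols \beta_0, \boldsymbol{\beta}, u_j, \varepsilon_{ij}, \sigma_u^2 and \sigma_\varepsilon^2 are labels chosen here, not notation used earlier in the thread. Observation indices on the year indicators are suppressed, as in #9.

```latex
\begin{aligned}
\text{Migration Worries}_{ij} &= \beta_0 + \mathbf{x}_{ij}'\boldsymbol{\beta}
  + \sum_{k=2}^{5} \delta_{k}\,\text{syear}_{k} + u_j + \varepsilon_{ij},\\
u_j &\sim N(0,\sigma_u^2), \qquad \varepsilon_{ij} \sim N(0,\sigma_\varepsilon^2)
\end{aligned}
```

Here u_j is the district-level random intercept corresponding to || nuts3: in the code, and the first survey year serves as the reference category for the indicators.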