Interpreting Stata "mixed" results with categorical + continuous interactions

Kathleen Hupfeld

Join Date: Aug 2017

Posts: 26
#1

Interpreting Stata "mixed" results with categorical + continuous interactions

15 Feb 2018, 18:58

I am trying to run a rather complex linear mixed model on a longitudinal dataset with ~approx. 2,043 subjects. To do this, I used the "mixed command" and wrote the model:

Code:

mixed fullTandImperfect c.height ev1 ev2 ev3 ev4 ev5 ev6 ev7 ev8 ev9 ev10 ib0.rs1611115_A##c.timesince##ib0.agei_group##GENDER, allbaselevels || hhidpn: timesince, variance mle

This is longitudinal data, in which every person has at least 2 time points and up to 6 time points (but some folks have missing data points).

Here:
-fullTandImperfect = the dependent variable (a continuous measure of balance time)
-height = a covariate of no interest
-ev1-ev10 = 10 covariates controlling for population structure (as is common in genetic association studies)
-ib0.rs1611115_A = a SNP (3 genotype groups, coded 0, 1, and 2)
-timesince = the number of years since baseline visit at each measurement (values = 0-6 years)
-agei_group = the age group at the first measurement (coded as 0 = 65-74 year olds; 1 = 75-84 year olds; 2 = 85+ year olds)
-GENDER = where 0 is male & 1 is female

I attached example output to this post (PDF).

I used the allbaselevels command to try and figure out how to interpret, in particular, the interactions.

My primary research questions are:
1) Does genotype group (rs1611115_A) predict change over time in fullTandImperfect score?

2) Do certain genotype (rs1611115_A) / age groups (agei_group) change differently in fullTandImperfect score over time? (e.g., do 85+ year old AA folks decline faster over time than other groups?)

3) Do certain genotype (rs1611115_A) / gender groups (GENDER) change differently in fullTandImperfect score over time? (e.g., do AA males decline faster over time than other groups?)

4) Do certain genotype (rs1611115_A) / age groups (agei_group) / gender groups (GENDER) change differently in fullTandImperfect score over time? (e.g., do 85+ year old AA males decline faster over time than other groups?)

Right now, I am stuck on a few points:
1) Am I correct in my interpretation that I should be looking at the results of the interactions with timesince for all of these questions (which are related to how groups might change differently over time)?

2) Are the interaction results comparing each group to ALL of the other groups? e.g., if in the attached output, the following interaction is "significant" at p = 0.032, (if for ib0.rs1611115_A, 2 = GG genotype and for GENDER, 2 = female), would this translate to being able to say:

"GG genotype females are declining significantly faster in balance time across years since baseline testing (p < 0.032) than ALL of the other genotype/gender combination groups?"

3) Additionally, since the coefficient here is -5.697933, could we say that GG females are declining by -5.697933 per year? (i.e., that the slope for GG females is -5.697933?)

Code:

rs1611115_A#GENDER#c.timesince 0 1 0 (base) 0 2 0 (base) 1 1 0 (base) 1 2 .3524937 1.224969 0.29 0.774 -2.048402 2.75339 2 1 0 (base) 2 2 -5.697933 2.660539 -2.14 0.032 -10.91249 -.4833723

I want to run this model with a number of different SNPs, so right now my primary concern is how to interpret the model and figure out if this is an appropriate way to answer my research questions.

Attached Files

dbh_allbaselevels_output.pdf (54.7 KB, 2 views)
Tags: None
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

15 Feb 2018, 22:00

Are you sure you need such a complicated model? You are estimating 18 (3 age groups X 3 genotypes X 2 sexes) different slopes for timesince. Worse still, if you really plan to contrast each of them with all the others, that is 153 pairwise comparisons. Do you really want to split it that finely? Your primary research questions do imply that you need this level of complexity. But are these questions really well thought out, or is this just throwing things at the data to see if anything "hits?" I'm thinking you might be able to run a simpler model, which would, in turn, be simpler to interpret. For example, a model with c.timesince##i.(genotype age_group sex) would be easier to work with. But it would only allow you to test whether gender influences the rate of change in the outcome, and whether age group does, and whether sex does, but not combinations of those.

Be that as it may. You are looking at the regression coefficients, and to get from there to the answers to your questions is a lot of very tedious, error-prone coding. Very little of what you want can be read off from the regression output. You will be better off using the -margins- command to do this. For example:

Code:

margins genotype#age_group#sex, dydx(timesince)

will give you the slope of the outcome:timesince relationship in each combination of genotype, age_group, and sex. If you want to do some selected contrasts of specific combinations you can then use -test- or -lincom- to do those.

To turn to your specific questions:
1) Yes. To ask if X's change with time is different from Y's, then you need to look at interactions of the variables that distinguish X and Y with time.
2) No. The implied contrast in the regression output is to the base category of the interaction, i.e. to the group in which all of the variables in the interaction term take on their base values (i.e. 65-74 year old males with genotype 0).
3) No. You can't really draw any conclusion about the slope for GG females from the regression output because in using a four-way interaction model you have stipulated that there is no such thing as that slope. There are three different slopes for GG females, one for each of the three age groups. If you want to average across those three slopes, you can do this:

Code:

margins, dydx(timesince) at(genotype = 2 sex = 1)

That would give you the average marginal effect of timesince among females with genotype 2 (GG).

The -margins- command greatly simplifies the interpretation of interaction models. I recommend you introduce yourself to it by reading the excellent Richard Williams' https://www3.nd.edu/~rwilliam/stats/Margins01.pdf. It contains some worked examples of simpler interaction models. After that, read the -margins- chapter of the PDF manuals that are part of your official Stata installation, where some more complicated models are also shown (though I don't recall any examples with 4-way interactions). That should give you enough of a sense of it that you'll be able to figure out how to formulate the right -margins- command for whatever particular slopes you are interested in.

In the future, please do not show output by attaching a PDF. Please copy/paste the output directly into the forum editor, between code delimiters, just as you did with the short excerpt.
1 like
Comment

Kathleen Hupfeld

Join Date: Aug 2017
Posts: 26

17 Feb 2018, 16:55

Hi Clyde,

Thank you very much for your detailed response! That was incredibly helpful.

I had a few follow-up questions, if you have any further input:

I ran

Code:

mixed meanWalk c.height ev1 ev2 ev3 ev4 ev5 ev6 ev7 ev8 ev9 ev10 ib0.rs4680_A##c.timesince##ib1.agei_group##GENDER || hhidpn:timesince, variance mle
margins rs4680_A#agei_group#GENDER, dydx(timesince) 
testparm i.rs4680_A#i.agei_group#i.GENDER, equal

which produces:

Code:

. testparm i.rs4680_A#i.agei_group#i.GENDER, equal 
 ( 1)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]0bn.rs4680_A#1bn.agei_group#2.GENDER = 0
 ( 2)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]0bn.rs4680_A#2.agei_group#1bn.GENDER = 0
 ( 3)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]0bn.rs4680_A#2.agei_group#2.GENDER = 0
 ( 4)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]0bn.rs4680_A#3.agei_group#1bn.GENDER = 0
 ( 5)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]0bn.rs4680_A#3.agei_group#2.GENDER = 0
 ( 6)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]1.rs4680_A#1bn.agei_group#1bn.GENDER = 0
 ( 7)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]1.rs4680_A#1bn.agei_group#2.GENDER = 0
 ( 8)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]1.rs4680_A#2.agei_group#1bn.GENDER = 0
 ( 9)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]1.rs4680_A#2.agei_group#2.GENDER = 0
 (10)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]1.rs4680_A#3.agei_group#1bn.GENDER = 0
 (11)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]1.rs4680_A#3.agei_group#2.GENDER = 0
 (12)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]2.rs4680_A#1bn.agei_group#1bn.GENDER = 0
 (13)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]2.rs4680_A#1bn.agei_group#2.GENDER = 0
 (14)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]2.rs4680_A#2.agei_group#1bn.GENDER = 0
 (15)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]2.rs4680_A#2.agei_group#2.GENDER = 0
 (16)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]2.rs4680_A#3.agei_group#1bn.GENDER = 0
 (17)  - [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER + [timesince]2.rs4680_A#3.agei_group#2.GENDER = 0

           chi2( 17) =   91.45
         Prob > chi2 =    0.0000

1.) Would this be an omnibus-type test for the slope differences? e.g., is this testing whether there is some difference among the outcome:timesince relationship slopes? (--which would then be pretty useless because it wouldn't be telling us which slope(s) are different?

2.) Relatedly, if I did individual tests using test, testparm, or lincom, this just tells me if two specific slopes are different from each other? & again, this wouldn't be super helpful because I'd have 153 test commands to run (& need to correct for that number of multiple comparisons)?

For instance:

Code:

test _b[timesince:0bn.rs4680_A#1bn.agei_group#1bn.GENDER]=_b[timesince:0bn.rs4680_A#1bn.agei_group#2.GENDER]


 ( 1)  [timesince]0bn.rs4680_A#1bn.agei_group#1bn.GENDER - [timesince]0bn.rs4680_A#1bn.agei_group#2.GENDER = 0

           chi2(  1) =    0.78
         Prob > chi2 =    0.3765

3.) Does the "test" function have any shorthand syntax to run pairwise comparisons of all of the slopes? I didn't see any mention of this in the Stata manual for test.

4.) Is it possible to use the contrast command in conjunction with margins to get comparisons between the slopes for each genotype/age/gender group? e.g., I tried this code, but this doesn't look like it's comparing all of the group slopes?

Code:

 margins r.rs4680_A, over(agei_group GENDER) dydx(timesince) contrast

Contrasts of average marginal effects

Expression   : Linear prediction, fixed portion, predict()
dy/dx w.r.t. : timesince
over         : agei_group GENDER

------------------------------------------------------------------------------------
                                                 |         df        chi2     P>chi2
-------------------------------------------------+----------------------------------
timesince                                        |
                      rs4680_A@agei_group#GENDER |
  (AG, Val/Met vs GG, Val/Val) 65-74 years#Male  |          1        3.11     0.0779
(AG, Val/Met vs GG, Val/Val) 65-74 years#Female  |          1        0.21     0.6481
  (AG, Val/Met vs GG, Val/Val) 75-84 years#Male  |          1        0.92     0.3382
(AG, Val/Met vs GG, Val/Val) 75-84 years#Female  |          1        0.03     0.8570
    (AG, Val/Met vs GG, Val/Val) 85+ years#Male  |          1        2.10     0.1472
  (AG, Val/Met vs GG, Val/Val) 85+ years#Female  |          1        0.68     0.4095
  (AA, Met/Met vs GG, Val/Val) 65-74 years#Male  |          1        0.00     0.9836
(AA, Met/Met vs GG, Val/Val) 65-74 years#Female  |          1        0.50     0.4810
  (AA, Met/Met vs GG, Val/Val) 75-84 years#Male  |          1        0.01     0.9329
(AA, Met/Met vs GG, Val/Val) 75-84 years#Female  |          1        0.49     0.4833
    (AA, Met/Met vs GG, Val/Val) 85+ years#Male  |          1        4.58     0.0324
  (AA, Met/Met vs GG, Val/Val) 85+ years#Female  |          1        0.03     0.8563
                                          Joint  |         12       14.16     0.2904
------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------
                                                 |   Contrast Delta-method
                                                 |      dy/dx   Std. Err.     [95% Conf. Interval]
-------------------------------------------------+------------------------------------------------
timesince                                        |
                      rs4680_A@agei_group#GENDER |
  (AG, Val/Met vs GG, Val/Val) 65-74 years#Male  |   .0385047   .0218394     -.0042998    .0813092
(AG, Val/Met vs GG, Val/Val) 65-74 years#Female  |  -.0084757    .018569     -.0448703    .0279188
  (AG, Val/Met vs GG, Val/Val) 75-84 years#Male  |   .0338541   .0353509     -.0354324    .1031406
(AG, Val/Met vs GG, Val/Val) 75-84 years#Female  |  -.0050431   .0279958     -.0599137    .0498276
    (AG, Val/Met vs GG, Val/Val) 85+ years#Male  |   .1637292   .1129498     -.0576483    .3851066
  (AG, Val/Met vs GG, Val/Val) 85+ years#Female  |   .0580307   .0703549     -.0798624    .1959238
  (AA, Met/Met vs GG, Val/Val) 65-74 years#Male  |  -.0005327    .025882     -.0512604    .0501951
(AA, Met/Met vs GG, Val/Val) 65-74 years#Female  |   .0149254   .0211812      -.026589    .0564397
  (AA, Met/Met vs GG, Val/Val) 75-84 years#Male  |  -.0033847   .0401963      -.082168    .0753987
(AA, Met/Met vs GG, Val/Val) 75-84 years#Female  |  -.0215462   .0307329     -.0817816    .0386891
    (AA, Met/Met vs GG, Val/Val) 85+ years#Male  |   .2544322   .1188981      .0213962    .4874682
  (AA, Met/Met vs GG, Val/Val) 85+ years#Female  |    .014123    .077998     -.1387503    .1669963
--------------------------------------------------------------------------------------------------

Thanks again for all of your help!

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#4

17 Feb 2018, 17:13

1.) Would this be an omnibus-type test for the slope differences? e.g., is this testing whether there is some difference among the outcome:timesince relationship slopes? (--which would then be pretty useless because it wouldn't be telling us which slope(s) are different?

Correct, that is an omnibus test. As for whether it's useless, that depends on your goals. Sometimes that is the paydirt and the individual comparisons are not wanted. I gather your situation is rather the opposite, though.

2.) Relatedly, if I did individual tests using test, testparm, or lincom, this just tells me if two specific slopes are different from each other? & again, this wouldn't be super helpful because I'd have 153 test commands to run (& need to correct for that number of multiple comparisons)?

Right. Individual tests would just tell you about two specific slopes. As for whether you would correct for multiple comparisons, that is a complicated question. In this case, with all 153 comparisons being made, it probably makes sense to adjust for multiple comparisons because it's pretty clear cut how many comparisons you are doing. In the more general situation, it is often unclear how to do this because the actual number of comparisons being made is vague: do you count only those you report, all the ones you did whether reported or not, or all the ones you thought of doing though perhaps you gave up before doing them all? My usual practice is to not correct for multiple comparisons and to warn the reader prominently that I have not done so and leave it to the reader to make of it what he or she will. But that practice is motivated in part by the rather low regard in which I hold p-values in the first place. If it's not worth doing, it's not worth doing well.

3.) Does the "test" function have any shorthand syntax to run pairwise comparisons of all of the slopes? I didn't see any mention of this in the Stata manual for test.

No.

4.) Is it possible to use the contrast command in conjunction with margins to get comparisons between the slopes for each genotype/age/gender group? e.g., I tried this code, but this doesn't look like it's comparing all of the group slopes?

What you want is:

Code:

margins rs4680_A#agei_group#GENDER, dydx(timesince) pwcompare

Note that the -pwcompare- option also has suboptions that allow you to customize what specific outputs you get for each comparison. See the PDF manuals for details.
Your original code not only was omitting many contrasts, the ones it did compute are probably incorrect for your purposes. The -over()- option in -margins- is widely misunderstood: it is used for calculating conditional margins and marginal effects. While there are circumstances where those are what is wanted, more commonly what people want is adjusted margins and marginal effects. To get the latter you cannot use -over()-. You must either list the variables between -margins- and the comma (if they are discrete variables from the original regression) or specify the values you want for them in an -at()- option.
1 like
Comment
Kathleen Hupfeld

Join Date: Aug 2017

Posts: 26
#5

17 Feb 2018, 17:40

Great, that all makes sense. Thanks again for all of your help and detailed answers!
Comment
Kathleen Hupfeld

Join Date: Aug 2017

Posts: 26
#6

18 Feb 2018, 11:50

I had another question come up-- if I run the model & predicted margins with dydx(timesince):

Code:

mixed meanWalk c.height ev1 ev2 ev3 ev4 ev5 ev6 ev7 ev8 ev9 ev10 ib0.rs6265_A##c.timesince##ib1.agei_group##GENDER || hhidpn: timesince, variance mle margins rs6265_A#agei_group#GENDER, dydx(timesince) marginsplot

..when I do marginsplot here, I believe that it's graphing these predicted slopes for each age/genotype/gender group, correct?

What would be the appropriate way to graph the predicted values of this model over timesince? e.g., if I run:

Code:

margins, at (timesince=(1 (1) 11) rs6265_A=(0 1 2) agei_group=(1 2 3) GENDER=(1 2)) marginsplot

..am I changing the model since I'm asking for the graph to plot predictions at representative values of timesince? Or are the slopes for each age/genotype/gender group that I am predicting in the margins rs6265_A#agei_group#GENDER, dydx(timesince) model the slopes over timesince on this graph?

Best,
Kathleen
Comment

Announcement