
  • ANOVA and Post-hoc estimates using data including multiple imputations

    Dear all,

    Once more, multiple imputation is the bane of my existence.
    I hope somebody can help me with what started out as a small calculation. I couldn't find much information on the web, and what I did find all dealt with multi-factor designs and/or repeated measures, which is not what I am trying to do, so I didn't see how to apply it to my problem. What's more, the imputations in my data set make it very hard to adapt anything, since most commands do not work. I am using Stata 13.

    My main problem: I am simply trying to compare five group means on a 4-point scale.
    But since the data had a substantial amount of missingness, I decided to use multiple imputation (m=5) to handle it. I was easily able to get a table combining the means of each group over all imputations using

    Code:
     mi estimate: mean scale, over(gruppe)
    Now I want to know if the differences in the means are significant and which groups differ from which. Normally I would simply use the anova or oneway command with post-hoc tests and be done with it. The problem is that Stata does not allow the anova or oneway command with mi estimate.
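
    Without the imputations, the post-hoc comparison would be a one-liner, for instance (just a sketch, with Bonferroni-adjusted comparisons):

    Code:
     oneway scale gruppe, bonferroni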

    After some digging I found something of a workaround: using the mixed command with effect coding (following the advice in the article by Ginkel and Kroonenberg, 2014).
    This is the syntax I used:

    Code:
      mi estimate: mixed scale Gruppe2 Gruppe3 Gruppe4 Gruppe5 ||


    So far so good. But as far as I understood, the results from this analysis only show me which groups differ significantly from the grand mean, not which groups differ significantly from each other, as I would get from post-hoc tests. Unfortunately, Ginkel and Kroonenberg do not address post-hoc tests.

    In addition, I discovered that I can force Stata to run the anova command anyway by using the cmdok option. However, the results look much like the ones from the mixed model, so basically I end up with the same information.



    I tried to use the contrast postestimation command to get a post-hoc-like result. But, once again, that doesn't work with mi estimate. If I try to force it to run, using the cmdok option again, I get an error message.

    Code:
     requested action not valid after most recent estimation command
    an error occurred when mi estimate executed contrast on m=1
    Does somebody know a way around this? I simply need a measure of whether the groups differ, so I can report it.

    I would greatly appreciate any help or suggestions.

    Thanks in advance.

    P.S.: Sorry for the long post. I hope it makes sense.
    Last edited by Maleika Krüger; 15 Apr 2016, 07:14.

  • #2
    Why do you think you need to use mixed? If you thought you could do the analysis with ANOVA, you should be able to set up a comparable analysis with regress. You need to figure out how to get the contrasts you want following mi estimate: reg. See the mi test command, which allows linear contrasts among coefficients. If you do this, be sure to read the introduction section of the documentation on post tests. You probably should do many more than five imputations. Finally, the preference on Statalist is for users to provide a full name; see FAQ #4 for how and why to do this.
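
    A sketch of what that might look like (assuming the grouping variable is gruppe, as in your first post; the joint test here is the ANOVA-style overall test):

    Code:
     mi estimate: regress scale i.gruppe
     mi test 2.gruppe 3.gruppe 4.gruppe 5.gruppe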
    Richard T. Campbell
    Emeritus Professor of Biostatistics and Sociology
    University of Illinois at Chicago



    • #3
      See whether mimrgns (SSC) can help here. Maybe something like

      Code:
      mi estimate : regress scale i.Gruppe
      mimrgns Gruppe , pwcompare
      Note that you need to use factor-variable notation rather than creating the indicator (dummy) variables yourself.

      Best
      Daniel



      • #4
        By the way,

        Code:
         mi estimate: mixed scale Gruppe2 Gruppe3 Gruppe4 Gruppe5 ||
        So far so good. But as far as I understood, the results from this analysis only show me which groups differ significantly from the grand mean,
        this is not true. These results (the coefficients) express differences from the base level (Gruppe 1), not the grand mean.

        Best
        Daniel



        • #5
          Thank you both so much for your valuable input.

          First off, yes, I am aware of the full-name policy; I changed it right after posting the thread, and it should be active now.
          Second, sorry it took me a little longer to reply. I wanted to try out your suggested solutions first.

          Originally posted by Dick Campbell
          Why do you think you need to use mixed? If you thought you could do the analysis with ANOVA you should be able to set up a comparable analysis in regress. You need to figure out how to get the contrasts you want following mi estimate: reg. See the mi test command which allows linear contrasts among coefficients. If you do this, be sure to read the introduction section of the documentation on post tests.
          I understood from the literature by Ginkel & Kroonenberg that mixed is the appropriate method. I am not an expert on this matter and let myself be guided by their work.
          After looking into it, I agree with you. It's probably because the examples I could find always had repeated measures over time and many more factors in general.
          I am using regress now.


          Daniel, thank you for the suggestion of mimrgns, but after looking into the matter I decided to go with Dick's mi test command, since it was easier for me to understand.

          Dick, I saw what you meant about >5 imputations in the introduction to mi test, and I will increase the number.

          Originally posted by daniel klein
          By the way,

          this is not true. These results (the coefficients) express differences to the base level (Gruppe 1) not the grand mean.

          Best
          Daniel
          As I said in my initial post, I deliberately did not use simple dummy variables but used effect coding instead. I created the effect-coded dummies like this:


          Code:
          recode gruppe (2 = 1) (1 = -1) (nonmissing = 0), gen(ekgruppe2)
          recode gruppe (3 = 1) (1 = -1) (nonmissing = 0), gen(ekgruppe3)
          recode gruppe (4 = 1) (1 = -1) (nonmissing = 0), gen(ekgruppe4)
          recode gruppe (5 = 1) (1 = -1) (nonmissing = 0), gen(ekgruppe5)
          with category 1 as the baseline category.
          As far as I know, with effect coding the coefficients then express the difference from the grand mean instead of from the base level (as with normal dummy variables). It seemed like the more logical choice to me at the time, because it follows the logic of ANOVA more closely (again, I let myself be guided by the work of Ginkel & Kroonenberg, 2014), but I would be happy to hear your opinion on this.
          Last edited by Maleika Krüger; 18 Apr 2016, 11:36.



          • #6
            Lastly, after trying out the mi test command, may I bother you once more to check whether I did it correctly and to answer a few questions about interpreting the results?

            Note: I haven't adjusted the number of imputations yet, since I wanted to get this right before going back and fixing it.


            I calculated the model below:

            Code:
            mi estimate, saving(saved): regress UntSprSuS ekGruppe2 ekGruppe3 ekGruppe4 ekGruppe5
            
            mi estimate (diff1: _b[ekGruppe2]-_b[ekGruppe3]) (diff2: _b[ekGruppe2]-_b[ekGruppe4]) (diff3: _b[ekGruppe2]-_b[ekGruppe5]) (diff4: _b[ekGruppe3]-_b[ekGruppe4]) (diff5: _b[ekGruppe3]-_b[ekGruppe5]) (diff6: _b[ekGruppe4]-_b[ekGruppe5]) using saved, nocoef
            
            mi testtransform diff1 diff2 diff3 diff4 diff5 diff6
            With UntSprSuS being the scale.
            I understood that this tests whether the coefficients are equal. But does that allow me to say that there is also a significant difference in the group means of the five groups?


            Results:


            [Attached image: Results.png]



            Am I right to interpret these results in the following way:
            • The model is significant overall, meaning there is an influence of group membership on the scale UntSprSuS.
            • But only groups 3 and 4 differ significantly from the grand mean, while the model as a whole is significant.
            • Looking at the differences, only diff2, diff4, and diff6 show a significant result, meaning the difference between those coefficients is significant. But what does that tell me about the difference between the groups in question? Or is this table not important for my question at all, and should I only look at the following test results from mi test?
            • Also, with effect coding I have no way to include the baseline category in the test. So maybe using normal dummy variables would be a better way after all? In that case, wouldn't the constant serve as the effect of the baseline category, so I could include it in the test as well, something along the lines of (diffx: _b[ekGruppe2]-_cons)? Would that be an option?
            • If I look at the test, 3 out of 6 differences were dropped. I looked it up, and it means they were redundant. I don't quite understand why yet, but I will figure it out eventually, so that's not my biggest concern right now.
            • My bigger question is: the test was significant overall, but what does that mean for my research question? Again, as above, does that allow me to conclude that the means of the five groups do indeed differ significantly from each other, i.e. that group membership has a significant effect on the variable UntSprSuS?
            Starting out with the analysis, I was very confident that I was on the right track and happy that the command worked at all, but the longer I think about it, the more I doubt that this is the right way to use the mi test command to get what I want.
            I am sorry if these questions are not very clever or if I messed up the whole thing. Thank you in advance for your help; I really appreciate your input on this.

            Best,
            Maleika

            P.S.: Sorry for the long posts. I tried to make them easier to follow by splitting them.



            • #7
              After some digging I found something of a workaround, by using the mixed command with effect coding
              I missed this mention of effect coding, so my earlier post is indeed wrong; sorry. Anyway, I do not fully understand why you would go with effect coding

              to simply compare five group means on a 4-point scale.
              But then, I am not entirely sure what exactly you want to know. You state that you

              want to know if the differences in the means are significant and which groups differ from which.
              Given your suggestion to use an ANOVA model, I would think that you get what you want from the overall F-test of a linear regression model in which the scale is regressed on a set of indicator (dummy) variables. You can then look at the pairwise comparisons to get an idea of which means differ from each other. I have shown how to do this above. If you do not wish to use mimrgns, you can construct the contrasts yourself from transformed coefficients, as you did above. The results will be exactly the same.
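
              For instance, with dummy coding the pairwise contrasts can be built directly from the coefficients; a sketch (the filename miest is just an example, and comparisons against the base category are the coefficients themselves):

              Code:
              mi estimate, saving(miest, replace): regress scale i.Gruppe
              mi estimate (d32: _b[3.Gruppe]-_b[2.Gruppe]) (d42: _b[4.Gruppe]-_b[2.Gruppe]) using miest, nocoef
              mi testtransform d32 d42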

              In general, I suggest you write down what you would do ignoring the multiple imputations, then figure out how to do the same thing with the multiply imputed data.

              Best
              Daniel
              Last edited by daniel klein; 18 Apr 2016, 13:27.



              • #8
                Hello Daniel,

                I was following the instructions you provide above. Specifically, I ran a regression with mi estimate, and then used the mimrgns command with the pwcompare option to identify differences between the groups. However, I cannot seem to get the mimrgns command to provide p-values for the pairwise comparisons. Can you tell me if it is possible to get p-values?

                Thanks,
                Jon



                • #9
                  Originally posted by Jon Phillips
                  I ran a regression with mi estimate, and then used the mimrgns command with the pwcompare option [...]. However, I cannot seem to get the mimrgns command to provide p-values for the pairwise comparisons.
                  Jon Phillips, I must have missed your post before. You may already have figured out an answer, but let me provide one for those reading this in the future. First, I want to point to the general advice on problems with mimrgns that I recently posted. Following the suggestion to run the analysis on the non-imputed data will show that margins with the pwcompare option reports confidence intervals by default. Reading through the help for

                  Code:
                  help margins pwcompare
                  shows that we need to specify the pwcompare(pveffects) option to obtain p-values with margins. Since mimrgns is basically a wrapper for margins, we just do the same:

                  Code:
                  mimrgns ... , pwcompare(pveffects)
                  Best
                  Daniel

