Mauchly's test of sphericity command

CEdward

Join Date: Nov 2014

Posts: 131
#1

10 Jan 2019, 18:57

I am trying to test the assumption of sphericity for the purpose of running a repeated measures ANOVA. I did not happen to come across the command for Mauchly's test of sphericity. Does anybody happen to know if such a command exists? If not, are there other alternatives for testing this assumption?
William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

10 Jan 2019, 19:48

The output of search sphericity includes a reference to the user-written mauchly command available from SSC. See

Code:

net describe mauchly, from(http://fmwww.bc.edu/RePEc/bocode/m)

for further information and an installation link, or just go for it with

Code:

ssc install mauchly

CEdward
#3

11 Jan 2019, 16:31

Thanks, William. So I have a repeated measures dataset as seen below. In some cases I have missing data for an individual at a timepoint. However, I cannot run the mauchly test if there is missing data in the variables. First, is my data setup (longitudinal) correct? Second, how do I overcome the issue of missing data?

Also - would the command: mauchly outcome, work?

This is what my dataset looks like.

input str6 id float(time outcome)
1 2 .
1 1 .
1 3 .43
2 2 .16
2 1 .
2 3 .
end

Last edited by CEdward; 11 Jan 2019, 17:20.

William Lisowski
#4

11 Jan 2019, 17:27

Let me first say that if you have not already done so, you must also install the moremata package, which is likewise available from SSC

Code:

ssc install moremata

Here are two examples of running mauchly based on the examples in the documentation, but with expanded data so that each person is scored twice in each time period.

Code:

. * does your data look like this?
. list, clean noobs

    person   time   score  
         1      1      30  
         1      1      33  
         1      2      28  
         1      2      30  
         1      3      16  
         1      3      18  
         2      1      14  
         2      1      14  
         2      2      18  
         2      2      15  
         2      3      10  
         2      3       9  
         3      1      24  
         3      1      26  
         3      2      20  
         3      2      17  
         3      3      18  
         3      3      20  
         4      1      38  
         4      1      38  
         4      2      34  
         4      2      36  
         4      3      20  
         4      3      23  
         5      1      26  
         5      1      24  
         5      2      28  
         5      2      28  
         5      3      14  
         5      3      17  

. * then do this
. xtset person
       panel variable:  person (balanced)

. mauchly score, m(time)

Mauchly's Test of Sphericity
________________________________________________________________________________
 Mauchly's W.   Chi2.   d.f.    P-value.   Epsilon_gg.   Epsilon_ff. Lower-bound
________________________________________________________________________________
  0.8364       0.5359     2      .7649       0.8594        1.0000       0.5000
________________________________________________________________________________

Code:

. * does your data look like this?
. list, clean noobs

    person   score1   score2   score3  
         1       30        .        .  
         1       33        .        .  
         1        .       28        .  
         1        .       30        .  
         1        .        .       16  
         1        .        .       18  
         2       14        .        .  
         2       14        .        .  
         2        .       18        .  
         2        .       15        .  
         2        .        .       10  
         2        .        .        9  
         3       24        .        .  
         3       26        .        .  
         3        .       20        .  
         3        .       17        .  
         3        .        .       18  
         3        .        .       20  
         4       38        .        .  
         4       38        .        .  
         4        .       34        .  
         4        .       36        .  
         4        .        .       20  
         4        .        .       23  
         5       26        .        .  
         5       24        .        .  
         5        .       28        .  
         5        .       28        .  
         5        .        .       14  
         5        .        .       17  

. * then do this
. collapse (sum) score1-score3, by(person)

. list, clean noobs

    person   score1   score2   score3  
         1       63       58       34  
         2       28       33       19  
         3       50       37       38  
         4       76       70       43  
         5       50       56       31  

. mauchly score1 score2 score3

Mauchly's Test of Sphericity
________________________________________________________________________________
 Mauchly's W.   Chi2.   d.f.    P-value.   Epsilon_gg.   Epsilon_ff. Lower-bound
________________________________________________________________________________
  0.8364       0.5359     2      .7649       0.8594        1.0000       0.5000
________________________________________________________________________________
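
The two layouts above are related: the wide layout can be produced from the long one. A minimal sketch, assuming the long data from the first listing (variables person, time, score, with two scores per person in each time period); it has not been run beyond what the two outputs above show, which report identical statistics for the two forms.

Code:

* start from the long layout: collapse to one total score per person and time
collapse (sum) score, by(person time)
* one row per person, with score1 score2 score3 as columns
reshape wide score, i(person) j(time)
mauchly score1 score2 score3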

William Lisowski
#5

11 Jan 2019, 18:32

Note that post #3 above is substantially different than it was when I began working on post #4 in response to it, which explains why post #4 doesn't directly respond to post #3 as it is currently written. What it does respond to is

However, I cannot run the mauchly test if there is missing data in the variables

which was present in the earlier version of post #3 as well.

Last edited by William Lisowski; 11 Jan 2019, 18:35.

CEdward
#6

11 Jan 2019, 22:48

That was really helpful William. However - when I run the mauchly command I get the following error message: <istmt>: 3499 mm_repeat() not found. Do you know why?

William Lisowski
#7

12 Jan 2019, 06:20

Yes, you have not followed the instructions in the first paragraph of post #4.
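
For reference, a short sketch of that step together with one way to check that it worked. The check uses Mata's which command to look up the function named in the error message; treating a successful lookup as proof that mauchly will then run is an assumption.

Code:

ssc install moremata
* confirm that Mata can now find the function named in the error message
mata: mata which mm_repeat()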

CEdward
#8

12 Jan 2019, 07:53

Ahhh sorry, I had applied the second set of code. For the first set I tried using xtset, but my id variable is a string. How would I get around that? The issue is that I can't simply number them 1, 2, 3, 4, 5 because there are different clusters of ids. This underscores that a repeated measures ANOVA is ultimately not an ideal method for these data, but I was just doing it for descriptive analyses.

My dataset actually looks like the first example (except there is only 1 score per score* variable), but the issue is that there is missing data in that variable - unlike yours which does not have any. Because of that I get this: Missing values in varlist

Last edited by CEdward; 12 Jan 2019, 08:07.

William Lisowski
#9

12 Jan 2019, 09:15

First, let me confirm, did you follow the instructions and install the moremata package from SSC? It is required for mauchly to run either set of codes, although the author of mauchly failed to document that requirement. With problematic data, mauchly may fail before it gets to the point where it attempts to use the mm_repeat() function provided by moremata.

At this point I'm going to have to ask you for example data - I've done about as much guessing about your data as I can, even to the point of giving separate answers in post #4 for two different possibilities. Even the best descriptions of data are no substitute for an actual example of the data. And you need to explain your data - what do you mean by "clusters of IDs" especially in the context of the mauchly command, for example.

Be sure to use the dataex command to present your example data. If you are running version 15.1 or a fully updated version 14.2, dataex is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex and read the simple instructions for using it. dataex will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use dataex.
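
As an illustration only, a call might look like the sketch below. The variable names id, time and outcome are taken from post #3, and the range of 12 observations is an arbitrary choice; substitute whatever is relevant to the question. Copy the entire result from the Results window into the post, including the delimiters that dataex prints around it.

Code:

* name the variables that matter for the question and a small range of observations
dataex id time outcome in 1/12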

I now find that I previously gave you this same advice (click here to see it) but you have chosen not to follow it, so I'll give some further advice that you have only been given implicitly, by reference to the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Please take the time to review the FAQ, especially sections 9-12 on how to best pose your question.

The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

CEdward
#10

12 Jan 2019, 10:09

I am sorry William. I think that I'm posting too quickly and not taking the time to explain things. I have downloaded the moremata package from SSC.

input str6 id float(Time outcome) str2 cluster1 cluster2
A 2 . 1 1
A 1 . 1 1
A 3 .43 1 1
B 2 .15 1 2
B 1 . 1 2
B 3 . 1 2
C 2 .86 1 3
C 1 .81 1 3
C 3 .72 1 3
D 2 .75 1 1
D 1 .64 1 1
D .66 1 1
end

As you can see there are missing data for some of the timepoints. The very first example that you posted actually captures the dataset exactly except there is only 1 score per time per individual. The complex part of this is that individuals are clustered within a larger group on the hierarchy (cluster1) which are then clustered within another set of groups that are lower on the hierarchy (cluster2).

William Lisowski
#11

12 Jan 2019, 11:30

The clustering in the data is apparently irrelevant to the mauchly command as implemented by the author - that is, it has no option for specifying clustering. If the Mauchly test is supposed to take clustering into account, then the mauchly command will not be suitable with your data.

Your example data is not particularly helpful, and does not appear to have been copied directly from the output of dataex in Stata's results window. (See below to see what it looks like when output from dataex, after replacing the missing value of Time in the final observation.) If you need to adjust your data, perhaps for privacy reasons, you should do that before using dataex, not afterwards.

Note that along with your id being string, your cluster1 is also. Both will need to be numeric to be usable in anova. Below you see how I use encode to generate a numeric id that corresponds 1:1 with the string id. Read the output of help encode for details.

In any event, if you have a substantial fraction of ids with missing values in your data (and in this example data half of your ids are missing data), the Mauchly test is likely to be inappropriate. As you can see, there are two possibilities for dropping observations, and the first version (omit just the observations with missing values) produces nonsense, while the second version, which additionally drops the observations with nonmissing values for any id that had missing values, tests only two of the four ids in your data.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str6 id float(Time outcome) str2 cluster1 float cluster2
"A" 2   . "1" 1
"A" 1   . "1" 1
"A" 3 .43 "1" 1
"B" 2 .15 "1" 2
"B" 1   . "1" 2
"B" 3   . "1" 2
"C" 2 .86 "1" 3
"C" 1 .81 "1" 3
"C" 3 .72 "1" 3
"D" 2 .75 "1" 1
"D" 1 .64 "1" 1
"D" 3 .66 "1" 1
end
* identify observations with missing value
generate todrop1 = missing(outcome)
* identify id with any missing values
bysort id: egen todrop2 = max(todrop1)
* create a numeric id corresponding to the string id
encode id, generate(idnum)
xtset idnum
* drop observation with missing values
drop if todrop1
mauchly outcome, m(Time)
* drop id with any missing values
drop if todrop2
mauchly outcome, m(Time)

Code:

. * drop observation with missing values
. drop if todrop1
(4 observations deleted)

. mauchly outcome, m(Time)

Mauchly's Test of Sphericity
________________________________________________________________________________
 Mauchly's W.   Chi2.   d.f.    P-value.   Epsilon_gg.   Epsilon_ff. Lower-bound
________________________________________________________________________________
   1.3e+03       -14.3690     2      1       0.5000        0.5000       0.5000
________________________________________________________________________________

. * drop id with any missing values
. drop if todrop2
(2 observations deleted)

. mauchly outcome, m(Time)

Mauchly's Test of Sphericity
________________________________________________________________________________
 Mauchly's W.   Chi2.   d.f.    P-value.   Epsilon_gg.   Epsilon_ff. Lower-bound
________________________________________________________________________________
   1.3e+03       0.0000     2      1       0.5000        1.0000       0.5000
________________________________________________________________________________
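
Before deciding whether the test is worth running at all, it can help to see how many ids have complete data. A minimal sketch, meant to be run on the example data as read in by the input block above, before either drop; the variable names nmiss and onerow are made up for illustration.

Code:

* count the missing outcomes within each id
bysort id: egen nmiss = total(missing(outcome))
* flag one observation per id so that each id is counted once
egen byte onerow = tag(id)
* distribution of the number of missing outcomes across ids
tabulate nmiss if onerow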
