Statistical comparison between 6 groups with unequal variance and 1 observation.

Joyce Kim

Join Date: Jan 2019

Posts: 4
#1

30 Jan 2019, 15:59

Dear Statalists,

I am analyzing a dataset that includes two variables, "tech changing rate" and "Group". "Group" is a categorical variable taking values 1 to 6, so there are 6 groups. I want to test statistically whether the rate of change differs across the groups. My issues are:
1) Variances between the groups are not equal
2) Samples are not independent; the way I categorize groups made each group dependent.
3) Group 4 only has one observation. Group 1 has 170, Group 2 has 250, Group 3 has 700, Group 5 has 30, and Group 6 has 90.

What I have concluded so far is that I cannot use ANOVA, since the variances and group sizes differ. Also, since the groups are not independent, I am guessing that I have to use the Friedman test, but I am not sure of this. Can anyone share an idea of how I should test for statistical differences between these 6 groups?

I appreciate your advice!!

Thanks!
Mike Lacy

Join Date: Apr 2014

Posts: 2416
#2

30 Jan 2019, 16:44

There's no information in your group with N = 1 (can't even compute the variance), so you can drop that one. And, depending on what you mean by "dependence," the answer might be difficult or not possible. That aside: Doing a permutation test via Stata's -permute- is one modern way to deal with concerns about ANOVA assumptions.
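
A minimal sketch of that first step, assuming the grouping variable is named Group and the outcome is named TechChangingRate (both names are assumptions; substitute your own):

```stata
* Assumed variable names: Group (values 1-6) and TechChangingRate (the outcome).
* Show n, mean, and SD by group; the SD is missing for a group with n = 1.
tabstat TechChangingRate, by(Group) statistics(n mean sd)

* Drop the single-observation group before any between-group test.
drop if Group == 4
```

The tabstat output also makes the unequal group sizes and standard deviations visible before any test is chosen.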

Joyce Kim

Join Date: Jan 2019
Posts: 4

30 Jan 2019, 18:09

Thanks so much, Mike. Here is more detail on what I mean by "dependence". The data cover 1300 schools, each with an ID. Each observation has indicators for the type of school, for example:

ID	Index	Public	Private	Vocational	Group
101	0.1	0	1	0	2
102	0.2	0	1	0	2
103	0.5	0	1	1	5
104	0.6	1	0	0	1
105	0.7	0	0	1	3
106	0	1	0	1	4
107	0.1	0	0	1	3
108	0.9	0	1	1	5
109	0.8	0	0	1	3
110	0.5	0	0	0	6
111	0.6	0	1	0	2
112	0.7	1	0	0	1
113	0.1	0	1	1	5
114	0.9	0	1	0	2
115	0.8	0	0	0	6
116	0.7	0	0	0	6
117	0.1	0	1	1	5
118	0.6	0	1	0	2
119	0.7	0	0	1	3
120	0	0	0	1	3

The groups are formed from these indicators as follows:

Group	Public	Private	Vocational
1	1	0	0
2	0	1	0
3	0	0	1
4	1	0	1
5	0	1	1
6	0	0	0
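
The mapping in the table above could be checked in Stata along these lines. This is only a sketch: it assumes the 0/1 indicators are named Public, Private, and Vocational (Stata names are case-sensitive), and GroupCheck is a hypothetical helper variable, not part of the original data.

```stata
* Assumed 0/1 indicator names: Public, Private, Vocational.
* GroupCheck is a hypothetical helper that rebuilds Group from the indicators.
generate byte GroupCheck = .
replace GroupCheck = 1 if Public == 1 & Private == 0 & Vocational == 0
replace GroupCheck = 2 if Public == 0 & Private == 1 & Vocational == 0
replace GroupCheck = 3 if Public == 0 & Private == 0 & Vocational == 1
replace GroupCheck = 4 if Public == 1 & Private == 0 & Vocational == 1
replace GroupCheck = 5 if Public == 0 & Private == 1 & Vocational == 1
replace GroupCheck = 6 if Public == 0 & Private == 0 & Vocational == 0

* Stops with an error if any school's Group disagrees with the table.
assert GroupCheck == Group

* Group sizes.
tabulate Group
```

Because each combination of indicators maps to exactly one value, every school appears in only one group.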

Since some of the schools are both public and vocational, or both private and vocational, I thought that the groups are not independent of each other. In this case, can I still use -permute- to test for statistical differences between the groups after dropping group 4?

Thanks for your help!


Mike Lacy

Join Date: Apr 2014

Posts: 2416
#4

30 Jan 2019, 19:47

What you describe is not dependence in the sense relevant here. (Dependence here would mean that an individual school was measured more than once on the outcome variable.) What you have is a measurement of school type that does not capture the distinctions you think are relevant, for which there is no statistical solution. If you don't care about having a clean measurement of your "group" variable, but want a test that does not rely on the same sampling distribution assumptions as a conventional ANOVA, yes, you could use -permute-. You'd want to read up on the virtues and problems with permutation tests, but they are easy to implement in Stata. If you were satisfied with the F-test as a measure of effect, you could do this:

Code:

permute Group e(F), reps(10000): anova TechChangingRate Group

Given your relatively large sample sizes, this might give you a similar p-value to what the conventional ANOVA would.
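
To make that comparison concrete, a hedged sketch that runs both tests side by side. The variable names follow the command above; the seed value is arbitrary and chosen only for illustration.

```stata
* Assumes the single-observation group has already been removed,
* e.g. with: drop if Group == 4

* Conventional one-way ANOVA, for reference.
anova TechChangingRate Group

* Permutation test of the same F statistic. seed() makes the result
* reproducible, and nodots suppresses the replication progress dots.
permute Group F=e(F), reps(10000) seed(12345) nodots: anova TechChangingRate Group
```

If the two p-values are close, the conventional ANOVA result is unlikely to be driven by its distributional assumptions.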
Joyce Kim

Join Date: Jan 2019

Posts: 4
#5

31 Jan 2019, 08:40

Thank you for your advice! I'll try -permute- for the analysis!