Dear Stata community,
I have 45 variables (named Image1-Image45), gathered from 300 respondents, whose observations partly overlap.
So for example: Participant 1 has responded to Image1, Image2 and Image3. Participant 2 to Image1, Image2 and Image4. Participant 3 to Image2, Image3 and Image4, etc.
In total, each participant gave a value (ranging from 0 to 10) to 10 of the 45 variables, chosen at random. That is why the responses sometimes overlap and most of the time don't. Hence, my data are partially paired.
I want to compare the means of these Image1-Image45 variables to see if they differ significantly. Say Image1 has 60 responses and an average rating of 7, and Image2 has 70 responses and an average rating of 6: do the ratings of Image1 and Image2 differ statistically?
An independent t-test is out of the question, because some observations overlap. In addition, the sample sizes are not equal, so among the unpaired tests you would end up at Welch's unequal variances t-test.
However, with a paired t-test, a lot of values will be dropped. Say Image1 and Image2 both have N=80, but only 14 participants rated both. Then the paired t-test will use only those 14 overlapping observations.
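To make the two options concrete, here is a minimal sketch of what I have considered so far. It assumes the data are in wide form, with one row per participant and Image1-Image45 set to missing wherever a participant did not rate that image; neither call handles the partial pairing properly, which is exactly the problem.

```stata
* Sketch only: assumes wide data, one row per participant,
* with missing values for images a participant did not rate.

* How many participants rated both images (the overlap)
count if !missing(Image1, Image2)

* Paired t-test: uses only the participants who rated both images
ttest Image1 == Image2

* Unpaired t-test with unequal variances (Welch's approximation):
* uses all ratings, but treats the two sets as independent
ttest Image1 == Image2, unpaired unequal welch
```

The first test discards every rating outside the overlap, while the second ignores the fact that some ratings come from the same participants.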
I have read about partially paired data in Bart et al. (1998), Sampling and Statistical Methods for Behavioral Ecologists, chapter 3 (Google Books link: http://bit.ly/2sQH3TZ, pp. 72-75). However, I cannot think of a Stata tool/command which I can use to still compare those means.
Using the search function, I could only find this post on another forum: https://stats.stackexchange.com/ques...-unpaired-data. The answers there are mostly mathematical, but I am hoping that there is a Stata command I can use.
So in sum: the dataset has partially overlapping variables, and I am looking for an appropriate t-test to compare the means.
Specifications
Using Stata 13 on Windows 10.
Variables are coded as numeric variables.