Wilcoxon Rank Sum Test between countries

Marius Bauer

Join Date: Nov 2020

Posts: 55
#1


01 Feb 2021, 07:30

Hello everyone,

I was talking to my supervisor, and he asked me to run a Wilcoxon rank-sum test on my dataset. He wants to see that my datasets are not identical and show differences.

My data look as follows: I have 3 countries, which are analyzed one by one and are not combined into one dataset. I have now combined dataset one with dataset two and run the rank-sum test using the following code:

Code:

ranksum zfund, by(country)

HTML Code:

Two-sample Wilcoxon rank-sum (Mann-Whitney) test

     country |      obs    rank sum    expected
-------------+---------------------------------
           1 |     4291    13414650    13261336
           2 |     1889     5684640   5837954.5
-------------+---------------------------------
    combined |     6180    19099290    19099290

unadjusted variance    4.175e+09
adjustment for ties   -463574.07
                      ----------
adjusted variance      4.175e+09

Ho: zfund(country==1) = zfund(country==2)
             z =   2.373
      Prob > z =   0.0177

zfund is my independent variable; it indicates my investment volume, which differs by country and over time. However, I am confused about Prob > z = 0.0177. Does this not mean that my datasets are similar?

thanks
Clyde Schechter

Join Date: Apr 2014

Posts: 30093
#2

01 Feb 2021, 12:34

No. "Prob > z = 0.0177" means "p-value = 0.0177," which, if you are going to do conventional null hypothesis significance testing, means you have a "statistically significant" difference in the value of zfund between the two countries.
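As a rough check, the reported z can be reproduced from the table in #1: it is the observed rank sum for country 1 minus its expected value under the null, divided by the square root of the adjusted variance. The sketch below uses the rounded figures as displayed, so the last digits are approximate; the symbols T_1 and Phi (the standard normal CDF) are notation introduced here, not part of the Stata output.

```latex
z = \frac{T_1 - \mathrm{E}[T_1]}{\sqrt{\mathrm{Var}_{\text{adj}}(T_1)}}
  \approx \frac{13414650 - 13261336}{\sqrt{4.175 \times 10^{9}}}
  \approx \frac{153314}{64614}
  \approx 2.373,
\qquad
p = 2\,\bigl(1 - \Phi(|z|)\bigr) \approx 0.018
```

That is in line with the reported 0.0177. Because the rank sum for country 1 is above its expected value, zfund tends to be higher in country 1 than in country 2.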

That said, if your advisor really said and meant that he wants to see that your data sets are not identical, one would not do any statistical test for that. It would require only finding two observations, one from each data set, that are different. I'll assume that you (or he) were just speaking loosely.
Marius Bauer

Join Date: Nov 2020

Posts: 55
#3

02 Feb 2021, 01:06

I am confused. I thought the Wilcoxon rank-sum test does exactly what you said?

What else would I use the rank-sum test for?

Thank you for the explanation of the p-value. I guess I misread something there.
Nick Cox

Join Date: Mar 2014

Posts: 35693
#4

02 Feb 2021, 01:31

It seems artificial to me to combine two countries when there are three. So that detail alone implies Kruskal-Wallis, not Mann-Whitney. But, but, but:

1. With thousands of data points, even trivial differences will lead to a report of significance at conventional levels.

2. Is independence of observations within countries satisfied, because that's an underlying assumption here?

3. I don't follow your supervisor's argument as you explain it.
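A minimal sketch of how the first point and the Kruskal-Wallis suggestion could be followed up in Stata. It assumes all three country files have already been appended into a single dataset in which country takes the values 1, 2 and 3; that coding is an assumption, as only values 1 and 2 appear in #1.

```stata
* Assumption: one appended dataset, with country coded 1, 2, 3

* Kruskal-Wallis test across all three countries at once
kwallis zfund, by(country)

* Pairwise Mann-Whitney for countries 1 and 2; porder additionally reports
* the estimated probability that zfund is larger in the first group
ranksum zfund if inlist(country, 1, 2), by(country) porder

* Descriptive summary, to judge whether the differences are large enough to matter
tabstat zfund, by(country) statistics(n median p25 p75)
```

With thousands of observations, the porder estimate and the medians say more about the size of any difference than the p-value does. None of this addresses whether observations within a country are independent.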
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#5

02 Feb 2021, 01:39

Marius:
as others highlighted in their replies, there is something strange in the way your data were appended and analyzed (as requested by your supervisor; what strikes me is the 2:1 countries comparison).
At the risk of tasting plain vanilla (with no chocolate variations), I would have gone -regress- (or -qreg-):

Code:

regress zfund i.country
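A slightly fuller sketch of the same idea, assuming the three countries are appended into one dataset. The vce(robust) option and the joint test are additions for illustration, not something the data description in #1 requires.

```stata
* Mean differences across countries, followed by a joint test
* that all country indicators are zero
regress zfund i.country, vce(robust)
testparm i.country

* Median regression, the more outlier-resistant alternative
qreg zfund i.country
```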

Kind regards,
Carlo
(Stata 19.0)