Help with swilk result

Basharat Hussain

Join Date: Apr 2016

Posts: 28
#1

Help with swilk result

01 Oct 2019, 02:35

The variable score in a dataset of 1020 records can take only three values: 0, 1, or 2. Why would the command "swilk score" give a P-value of 0.999, please?
Nick Cox

Join Date: Mar 2014

Posts: 35724
#2

01 Oct 2019, 02:45

Show us the frequencies please:

Code:

tab score

Try this to see why:

Code:

clear
input y f
0 255
1 510
2 255
end
expand f
swilk y
qnorm y

Last edited by Nick Cox; 01 Oct 2019, 02:51.

Basharat Hussain

Join Date: Apr 2016
Posts: 28
#3

01 Oct 2019, 08:38

Right Sir.

Code:

. tab score, nolabel

      Score |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        346       34.95       34.95
          1 |        201       20.30       55.25
          2 |        443       44.75      100.00
------------+-----------------------------------
      Total |        990      100.00

. swilk score

                   Shapiro-Wilk W test for normal data

    Variable |        Obs       W           V         z       Prob>z
-------------+------------------------------------------------------
       score |        990    0.99955      0.281    -3.143    0.99916

Regards.


Nick Cox

Join Date: Mar 2014

Posts: 35724
#4

01 Oct 2019, 09:40

It may clash with expectation but a distribution like that looks approximately normal to swilk. Recalling elementary probability theory that a normal can be a limit of a binomial may help people see that. The real question is why it seems of interest whether this variable is near normal. What do you intend to do with it? Is it a count or ordinal?
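
The binomial connection can be checked with a little arithmetic. A minimal sketch, assuming the frequencies 255, 510, 255 in the example in #2 were chosen as the expected counts for 1020 draws from a binomial with 2 trials and success probability 0.5 (that reading is an assumption, not something stated in the post):

```stata
* Expected counts for 1020 observations from Binomial(2, 0.5).
* binomialp(n, k, p) returns the probability of exactly k successes in n trials.
forvalues k = 0/2 {
    display "y = `k': " 1020 * binomialp(2, `k', 0.5)
}
```

The three probabilities are 0.25, 0.5 and 0.25, which reproduce the 255, 510, 255 pattern: a symmetric, single-peaked distribution, even though it has only three support points.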
Basharat Hussain

Join Date: Apr 2016

Posts: 28
#5

02 Oct 2019, 04:50

Thank you for the kind response.
The data are from an evaluation study regarding health facilities' preparedness for disaster management and mass casualty management, preparedness being an ordinal scale variable with 0, 1, and 2 denoting unprepared, partially prepared or fully prepared status for an indicator.
The idea was to compare the scores between various facilities, different locales, and the levels of certain other covariates.
Comparisons were also made for proportions of total possible scores actually achieved by a facility or a sub-component therein.
I was using t tests and ANOVA for comparing scores, based on the relatively large sample size and robustness of the said tests.
Just happened to test for normality of the scores and was surprised.
Normal being a limit of a binomial explains that. Will need to read a bit though.
Stay blessed.
Much obliged.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#6

02 Oct 2019, 04:56

Thanks for the detail.

Watch out: ANOVA and t tests with ordinal scores are widely considered somewhere between dubious and unacceptable. In practice you may well find that (e.g.) Wilcoxon-Mann-Whitney or Kruskal-Wallis gives similar P-values, but there are no guarantees (and the underlying hypotheses are not identical).

Even universities often average ordinal grades even while some of their employees teach that that is precisely what you are not justified in doing.
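
A rough sketch of how the rank-based tests mentioned above could be run next to their parametric counterparts. The variable names score and locale, and the group codes 1 and 2, are hypothetical and not taken from the poster's dataset:

```stata
* Hypothetical variables: score (0/1/2) and locale (numeric grouping variable).

* Two groups: Wilcoxon-Mann-Whitney rank-sum test (by() must define exactly two groups)
ranksum score if inlist(locale, 1, 2), by(locale)

* Three or more groups: Kruskal-Wallis test
kwallis score, by(locale)

* Parametric counterparts, for comparing the P-values
ttest score if inlist(locale, 1, 2), by(locale)
oneway score locale
```

Similar P-values from the two approaches would be reassuring but, as noted above, the underlying hypotheses are not identical, so agreement is not guaranteed.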
Basharat Hussain

Join Date: Apr 2016

Posts: 28
#7

02 Oct 2019, 20:21

My Pleasure Sir.
Learning is all mine.
I have submitted the article already, but in retrospect it would probably have been better to use the non-parametric alternatives. Will wait for the reviewers' comments now. Geoff Norman seems to have had some interesting encounters with reviewers in this vein (Likert scales, levels of measurement and the "laws" of statistics. Advances in Health Sciences Education: Theory and Practice. 2010;15(5):625-32).
Best Regards.

Last edited by Basharat Hussain; 02 Oct 2019, 20:28.
Basharat Hussain

Join Date: Apr 2016

Posts: 28
#8

02 Oct 2019, 22:08

One further question Sir.
Would it be correct to present the result of swilk above if a reviewer expresses reservations about the use of parametric tests?
Regards.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#9

03 Oct 2019, 00:34

If I am the reviewer the fact that swilk doesn't declare the variable to be non-normal is at best a distraction. Is the mean score something that makes sense to use? That's what you have to establish.

I can't speak for other reviewers, real or hypothetical.
Basharat Hussain

Join Date: Apr 2016

Posts: 28
#10

03 Oct 2019, 04:31

Thank you for the opinion.
The mean score does make sense and is in fact the only means of assessing a component; there, I think I am on solid ground.
Best Regards.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#11

03 Oct 2019, 07:22

No, no, no. You've convinced yourself. The game is to convince the reviewers.

I said much more on this at https://stats.stackexchange.com/ques...dinal-variable -- without naturally saying everything possible.

It's striking how a question like that attracts answers that just say "No; that's wrong".

That thread is a very rare example of an election going the way I prefer.

Last edited by Nick Cox; 03 Oct 2019, 07:32.
Basharat Hussain

Join Date: Apr 2016

Posts: 28
#12

04 Oct 2019, 00:00

Ooops! That is right.
Perhaps it might go unchallenged but the simple idea of mean score being a credible statistic can be argued for quite effectively, I think.
Would you please recommend a good text on probability theory?
Nick Cox

Join Date: Mar 2014

Posts: 35724
#13

04 Oct 2019, 00:40

I can recommend good texts on probability theory, indeed several, but none of them I know has anything to say about this kind of issue.