
  • Help with swilk result

    The variable score in a dataset of 1020 records can take only three values: 0, 1, or 2. Why would the "swilk score" command give a p-value of 0.999, please?

  • #2
    Show us the frequencies please:


    Code:
    tab score

    Try this to see why:

    Code:
    clear
    * y = score value, f = frequency (a 1:2:1 pattern)
    input y f
    0 255
    1 510
    2 255
    end

    * one observation per unit of frequency
    expand f

    swilk y

    qnorm y
    Last edited by Nick Cox; 01 Oct 2019, 02:51.



    • #3
      Right Sir.

      Code:
      . tab score, nolabel
      
            Score |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |        346       34.95       34.95
                1 |        201       20.30       55.25
                2 |        443       44.75      100.00
      ------------+-----------------------------------
            Total |        990      100.00
      
      . swilk score
      
                         Shapiro-Wilk W test for normal data
      
          Variable |        Obs       W           V         z       Prob>z
      -------------+------------------------------------------------------
             score |        990    0.99955      0.281    -3.143    0.99916
      Regards.



      • #4
        It may clash with expectation, but a distribution like that looks approximately normal to swilk. Recalling from elementary probability theory that a normal can be a limit of a binomial may help people see that. The real question is why it seems of interest whether this variable is near normal. What do you intend to do with it? Is it a count or ordinal?
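
        A minimal sketch of the binomial point, assuming p = 0.5 and a simulated sample of 1020. The seed and the variable names y2 and y50 are illustrative and not taken from the thread's data:

        ```stata
        clear
        set seed 2019
        set obs 1020

        * binomial counts with 2 trials can only be 0, 1, or 2 (assumed p = 0.5)
        generate y2 = rbinomial(2, 0.5)

        * with more trials the binomial moves closer to a normal shape
        generate y50 = rbinomial(50, 0.5)

        tabulate y2
        swilk y2 y50

        * normal quantile plots for a visual check
        qnorm y2, name(q2, replace)
        qnorm y50, name(q50, replace)
        ```

        With 2 trials and p = 0.5 the expected proportions are 0.25, 0.5, 0.25, the same 1:2:1 pattern as the example in #2.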



        • #5
          Thank you for the kind response.
          The data are from an evaluation study regarding health facilities' preparedness for disaster management and mass casualty management, preparedness being an ordinal scale variable with 0, 1, and 2 denoting unprepared, partially prepared or fully prepared status for an indicator.
          The idea was to compare the scores between various facilities, different locales, and the levels of certain other covariates.
          Comparisons were also made for proportions of total possible scores actually achieved by a facility or a sub-component therein.
          I was using t tests and ANOVA for comparing scores, based on the relatively large sample size and robustness of the said tests.
          Just happened to test for normality of the scores and was surprised.
          Normal being a limit of a binomial explains that. Will need to read a bit though.
          Stay blessed.
          Much obliged.



          • #6
            Thanks for the detail.

            Watch out: ANOVA and t tests with ordinal scores are widely considered somewhere between dubious and unacceptable. In practice you may well find that (e.g.) Wilcoxon-Mann-Whitney or Kruskal-Wallis gives similar P-values, but there are no guarantees (and the underlying hypotheses are not identical).

            Universities often average ordinal grades even while some of their employees teach that this is precisely what you are not justified in doing.
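
            A minimal sketch of running the parametric and rank-based tests side by side. The data are simulated, and the grouping variables urban and locale are hypothetical stand-ins for the facility and locale comparisons described in #5:

            ```stata
            clear
            set seed 2019
            set obs 990

            * hypothetical grouping variables, not from the original data
            generate byte urban  = runiform() < 0.5
            generate byte locale = ceil(4 * runiform())

            * hypothetical ordinal score taking values 0, 1, 2
            generate byte score = floor(3 * runiform())

            * two groups: t test versus Wilcoxon-Mann-Whitney
            ttest score, by(urban)
            ranksum score, by(urban)

            * several groups: one-way ANOVA versus Kruskal-Wallis
            oneway score locale
            kwallis score, by(locale)
            ```

            Comparing the P-values from each pair shows whether the choice of test matters for a given comparison, bearing in mind that the hypotheses being tested are not identical.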



            • #7
              My Pleasure Sir.
              Learning is all mine.
              I have submitted the article already, but in retrospect it would probably have been better to use the non-parametric alternatives. Will wait for the reviewers' comments now. Geoff Norman seems to have had some interesting encounters with reviewers in this vein (Likert scales, levels of measurement and the "laws" of statistics. Advances in Health Sciences Education: Theory and Practice. 2010;15(5):625-32).
              Best Regards.
              Last edited by Basharat Hussain; 02 Oct 2019, 20:28.



              • #8
                One further question Sir.
                Would it be correct to present the result of swilk above if a reviewer expresses reservations about the use of parametric tests?
                Regards.



                • #9
                  If I am the reviewer the fact that swilk doesn't declare the variable to be non-normal is at best a distraction. Is the mean score something that makes sense to use? That's what you have to establish.

                  I can't speak for other reviewers, real or hypothetical.



                  • #10
                    Thank you for the opinion.
                    The mean score does make sense and is in fact the only means of assessing a component; there, I think I am on solid ground.
                    Best Regards.



                    • #11
                      No, no, no. You've convinced yourself. The game is to convince the reviewers.

                      I said much more on this at https://stats.stackexchange.com/ques...dinal-variable -- without naturally saying everything possible.

                      It's striking how a question like that attracts answers that just say "No; that's wrong".

                      That thread is a very rare example of an election going the way I prefer.
                      Last edited by Nick Cox; 03 Oct 2019, 07:32.



                      • #12
                        Ooops! That is right.
                        Perhaps it might go unchallenged but the simple idea of mean score being a credible statistic can be argued for quite effectively, I think.
                        Would you please recommend a good text on probability theory?



                        • #13
                          I can recommend good texts on probability theory, indeed several, but none of them I know has anything to say about this kind of issue.
