  • Test for normality in a sub-sample of a complex survey design

    Hi folks.

    How can I test for normality in a sub-sample inside my main survey sample?
    I ran svy: swilk varoutcome (not even accounting for the sub-sample) and got the error "swilk is not supported by svy with vce(linearized)".
    On the other hand, I can test for normality in a sub-sample using swilk varoutcome if subgroup == 1, but that does not account for the survey design...

    How can I do both? The only option I see now is to test for normality in my sub-sample without weighting for the complex sample design, but that might be biased, right?


  • #2
    The stated problem is that you can't do this in practice.

    I want to suggest that it makes little sense in principle, which is my guess at why this isn't implemented. The implied sample size is typically massive and the implied spikiness of distribution considerable. Would you ever fail to reject a null in these circumstances?

    I am all for looking at marginal distributions as context in any project, which may often indicate skewness and/or outliers that need thought, or imply the usefulness of transformed scales or particular link functions. However, testing for normality is highly over-rated. Looking at marginal distributions means, most usefully, looking at graphs and (circumspectly) summary statistics.
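
    A minimal sketch of that kind of look, assuming the design has already been declared with svyset and that subgroup is a 0/1 indicator (both are guesses based on the commands quoted in #1):

    ```stata
    * design-based mean and standard deviation for the subpopulation
    * (assumes svyset has already been run; subgroup is assumed to be 0/1)
    svy, subpop(subgroup): mean varoutcome
    estat sd

    * unweighted look at the shape of the distribution in the subsample
    histogram varoutcome if subgroup == 1
    summarize varoutcome if subgroup == 1, detail
    ```

    The histogram and summarize lines ignore the design, so treat them as a rough picture of shape, skewness and outliers rather than as population estimates.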


    • #3
      Hi Nick. Thanks for the reply!

      My sample has ~5000 observations and my sub-sample ~2300. I'm afraid the two could diverge in distribution (also considering the weighting), but maybe it's just that I'm not so experienced with this kind of data treatment / analysis, which makes me more conservative, haha. Would you feel comfortable going on with these numbers/sizes? My sample is not normal, so I'm treating my sub-sample the same way, which is now causing me trouble, because my best-fitting model (gamma) has an AIC of ~20 000.


      • #4
        If you're confident that your data are not normal then with that sample size a Shapiro-Wilk test is pointless, akin to asking an expert to establish that what you know is a giraffe is not an elephant.

        A comparison of one subsample with its complement will still likely be highly relevant for the rest of your analysis. Quite how you do this is up to you. I tend to prefer quantile plots but histograms may work fine.
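
        One possible way to set up that comparison, as a sketch only: it assumes subgroup is coded 0/1, and wt is a hypothetical name standing in for whatever sampling weight variable the design uses.

        ```stata
        * split the outcome into varoutcome0 and varoutcome1 by subgroup
        * (these names arise only if subgroup takes the values 0 and 1)
        separate varoutcome, by(subgroup)

        * unweighted quantile-quantile plot of complement versus subsample
        qqplot varoutcome0 varoutcome1

        * weighted density estimates for the two groups
        * (wt is a hypothetical weight variable, not named in the thread)
        twoway (kdensity varoutcome [aweight = wt] if subgroup == 1) ///
               (kdensity varoutcome [aweight = wt] if subgroup == 0), ///
               legend(order(1 "subsample" 2 "complement"))
        ```

        The qqplot call does not use weights; the kdensity overlay is there as a check on whether weighting changes the picture much.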
