  • swilk test linear regression

    Hey guys,

    I ran a linear regression and checked the normality assumption. The normal quantile plot (qnorm) and all the other graphs for normality look reasonable, but the swilk test rejects normality and I don't really understand why. Is it because of the sample size, or is the model misspecified?

    [Image: Normality.png]

    Swilk-test:

        Variable |        Obs       W           V         z       Prob>z
    -------------+------------------------------------------------------
             red |        707    0.97999      9.207     5.418    0.00000

    Thanks in advance

    Ben

  • #2
    No dataset is exactly normal and yours isn't either. Two mild outliers are evident. What to do about them is a wide open question. I would certainly have a closer look at the data and see if there is an evident story. I would not ever advocate omitting the corresponding observations just because their values look awkward. Normality is the least important "assumption" (better phrase: ideal condition) behind your model, and the stance is not to reject your model because of one imperfection but to try to do better.
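
    One way to take that closer look can be sketched in Stata as below. This is only a minimal sketch: the names y, x1 and x2 are hypothetical stand-ins for the actual model variables, and res, rstu and absr are illustrative names for new variables, none of which appear in the original post.

    ```stata
    * Hypothetical variable names: replace y, x1, x2 with the real model
    regress y x1 x2

    * Raw and studentized residuals (res, rstu are illustrative names)
    predict double res, residuals
    predict double rstu, rstudent

    * Graphical checks: normal quantile plot and residual-versus-fitted plot
    qnorm res
    rvfplot, yline(0)

    * List the five observations with the largest absolute studentized residuals
    generate double absr = abs(rstu)
    gsort -absr
    list y x1 x2 rstu in 1/5
    ```

    Listing the flagged observations next to their covariates is meant to help see whether there is a story behind them. It is not a rule for dropping them.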

    • #3
      Benjamin:
      As an aside to Nick's helpful advice, please note that the larger the sample size, the higher the probability that even minor departures from normality (which should affect the residual distribution only, and weakly so) result in a statistically significant p-value.
      Setting aside the fact that exact normality is a matter for statistics textbooks, there are more substantive issues to investigate at the postestimation stage of OLS (or any other regression): for instance, does your model give a fair and true view of the data-generating process you're interested in?
      Kind regards,
      Carlo
      (Stata 19.0)
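
      The sample-size point can be illustrated with a small simulation. This is a sketch under stated assumptions: the seed, the variable name e and the choice of a Student's t distribution with 10 degrees of freedom as a "mildly non-normal" example are all illustrative and not taken from the thread. The upper subsample size is 2000 because swilk accepts between 4 and 2000 observations.

      ```stata
      clear
      set seed 2021
      set obs 2000

      * Mildly heavy-tailed draws: Student's t with 10 degrees of freedom
      * (illustrative choice, close to normal but not exactly normal)
      generate double e = rt(10)

      * Same distribution, increasing sample sizes
      swilk e in 1/50
      swilk e in 1/700
      swilk e in 1/2000
      ```

      With draws like these, the test will often (though not always) fail to reject at n = 50 and reject at the larger sizes, even though the departure from normality is the same in every subsample.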

      • #4
        Thanks a lot, guys. Your advice was really helpful.

        I didn't know this part: "the larger the sample size, the higher the probability that even minor departures from normality (which should affect the residual distribution only, and weakly so) result in a statistically significant p-value". This is really interesting.

        I will follow Nick's advice not to reject the model because of the swilk test. Apart from the swilk result, all model assumptions seem to hold, and the results seem reasonable.
