  • Could I winsorise at 1% and 99% levels for a sample of 500 observations?

    Dear all,

    I have a sample of 500 observations and detected some outliers in my dependent variable, so I decided to winsorise at the 1% and 99% levels. After winsorising, the regression result is significant, as I expected; without winsorising it is not. However, my supervisor does not allow me to winsorise. She says that my sample has only 500 observations and that I am manipulating the result.

    Could you please give me some advice?

    I would like to add the box plot here, but I don't know how. Could you show me?

    Thank you very much in advance.

    Celine.
    Last edited by Celine Tran; 16 Jan 2019, 23:49.
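For readers unfamiliar with the procedure in question, here is a minimal sketch of winsorising at the 1st and 99th percentiles. It is written in Python rather than Stata purely for illustration. The data are simulated (a hypothetical right-skewed variable, not the poster's), the function name `winsorise` is made up for this example, and the nearest-rank percentile rule is only one of several conventions, so cut-offs from other software may differ slightly.

```python
import math
import random


def winsorise(values, lower_pct=1.0, upper_pct=99.0):
    """Clamp values outside the given percentiles to those percentiles."""
    ordered = sorted(values)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank rule (an assumption; other percentile definitions exist).
        rank = max(1, math.ceil(p * n / 100))
        return ordered[rank - 1]

    lo, hi = percentile(lower_pct), percentile(upper_pct)
    # Values below lo are raised to lo; values above hi are lowered to hi.
    return [min(max(v, lo), hi) for v in values], lo, hi


random.seed(2019)
# Hypothetical right-skewed response with 500 observations.
y = [random.lognormvariate(0.0, 1.0) for _ in range(500)]

y_w, lo, hi = winsorise(y)
changed = sum(1 for before, after in zip(y, y_w) if before != after)
print(f"cut-offs: {lo:.3f} and {hi:.3f}; observations changed: {changed} of {len(y)}")
```

With 500 observations, 1% in each tail is about five values, so at most around ten observations are altered. If significance appears only after that change, the result rests on those few observations, which is presumably why a justification is being asked for.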

  • #2
    It depends on your field. If influential, well-published papers in your field that you cite in your paper windsorise, you will get away with it. If no influential papers do it, most people would probably hold the view of your supervisor. My personal view is that windsorisation is justified only if you have data errors, that is, your observations have extreme values (are "outliers") because of data errors. Otherwise, to me, windsorisation is manipulation of the results.

    For how to draw box plots check

    help graph box
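As background to reading a box plot, the sketch below computes the pieces such a plot usually displays: the quartiles, the whisker ends ("adjacent values") and the points plotted individually beyond them. It is in Python rather than Stata, uses made-up numbers, and assumes the common Tukey convention of fences at 1.5 times the interquartile range. The function name `box_plot_summary` is hypothetical, and the quartile rule used here may not match a given package's exactly.

```python
import statistics


def box_plot_summary(values):
    """Quartiles, adjacent values and outside values under the 1.5 * IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at the most extreme values still inside the fences.
        "lower_adjacent": min(inside),
        "upper_adjacent": max(inside),
        # Values beyond the fences are the ones drawn as separate points.
        "outside": sorted(v for v in values if v < lower_fence or v > upper_fence),
    }


# Made-up data with one large value.
y = [2.1, 2.4, 2.5, 2.7, 3.0, 3.1, 3.3, 3.6, 4.0, 9.5]
summary = box_plot_summary(y)
print(summary)
```

A point drawn beyond the whiskers is only flagged by a rule of thumb applied to one variable. That alone does not show it is a data error or that it should be altered.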


    • #3
      I agree with Joro, modulo the spelling, which is Winsorize or Winsorise, depending on how American or British you feel.

      I'd much rather deal with outliers through a transformation or link function. In my fields of interest, 80 or 90% of such difficulties boil down to thinking logarithmically.
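To illustrate what "thinking logarithmically" can do, here is a small sketch in Python with made-up positive values. It measures how far the largest value sits from the median, in units of the interquartile range, on the raw scale and on the log scale. The helper name `gap_in_iqr_units` is hypothetical, and the sketch covers only a transformation of the variable; a log link in a generalised linear model is a different route and is not shown.

```python
import math
import statistics


def gap_in_iqr_units(values):
    """Distance from the maximum to the median, divided by the IQR."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (max(values) - median) / (q3 - q1)


# Made-up, strictly positive data with one very large value.
y = [12, 15, 18, 22, 25, 30, 41, 55, 80, 1200]
log_y = [math.log(v) for v in y]

raw_gap = gap_in_iqr_units(y)
log_gap = gap_in_iqr_units(log_y)
print(f"raw scale: {raw_gap:.1f} IQRs above the median")
print(f"log scale: {log_gap:.1f} IQRs above the median")
```

On the log scale the same observation is far less extreme, and no value has been altered or discarded. This works only for positive data and changes the interpretation of the coefficients, so whether it suits a given model is a substantive question.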


      • #4
        Originally posted by Nick Cox
        I agree with Joro, modulo the spelling, which is Winsorize or Winsorise, depending on how American or British you feel.

        I'd much rather deal with outliers through a transformation or link function. In my fields of interest, 80 or 90% of such difficulties boil down to thinking logarithmically.
        Thank you, Nick, for educating me about the correct name of the procedure, "winsorise". In fact my error was not a spelling error: I was living under the false impression that the term comes from Windsor (the town in England). As it turns out, the procedure is named not after the town of Windsor but after "the engineer-turned-biostatistician Charles P. Winsor (1895–1951)". Well, one learns something new every day.


        • #5
          Windsorising is the process whereby other people are absorbed into British Royalty.


          • #6
            Thank you, Joro Kolev and Nick Cox.

            In my field, many authors winsorise their data without explanation. I followed them, but my supervisor asks me to justify why.

            Here are the box plot and histogram of the variable with and without winsorisation. Could you please have a look and give me some advice?

            Thank you very much in advance.


            Attached Files


            • #7
              Word attachments are deprecated here. Many people can't even open them. Please do read https://www.statalist.org/forums/help#stata, sections 12.4 and 12.5, for why and what to do instead.


              • #8
                Nick Cox, I am sorry for the belated response. Here is the box plot of my variable of interest. Based on this graph, I think there are some outliers, but my supervisor requires me to justify them. How can I detect the outliers? Could you please help me?

                Thank you very much in advance.
                Attached Files


                • #9
                  I would rather see a listing of all the data, including your predictor variables. Whether a data point is an outlier needs to be assessed with respect to all variables, not just the response in isolation.

                  Please do read and act on https://www.statalist.org/forums/help#stata (which among other details explains about posting .png not .gph).
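The point about judging a data point against all variables can be sketched with a one-predictor regression. The Python example below uses made-up data and a hypothetical helper, `influence_measures`, that computes residuals, leverage and Cook's distance from the textbook formulas for simple linear regression. A real analysis would involve all the predictors and the regression diagnostics of your statistical software; this is only an illustration of the idea.

```python
import statistics


def influence_measures(x, y):
    """Residuals, leverage and Cook's distance for the OLS fit y = a + b*x."""
    n = len(x)
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    p = 2  # number of estimated coefficients: intercept and slope
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    cooks_d = [
        (e ** 2 / (p * s2)) * (h / (1 - h) ** 2)
        for e, h in zip(residuals, leverage)
    ]
    return residuals, leverage, cooks_d


# Made-up data. The last point has by far the largest y, but it lies close to
# the line through the others. The fifth point has an unremarkable y, yet it
# is well off that line.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
y = [2.1, 3.9, 6.2, 7.8, 14.0, 12.2, 13.8, 16.1, 18.0, 60.0]

residuals, leverage, cooks_d = influence_measures(x, y)
for i, (e, h, d) in enumerate(zip(residuals, leverage, cooks_d), start=1):
    print(f"obs {i:2d}: y = {y[i - 1]:5.1f}  residual = {e:6.2f}  "
          f"leverage = {h:.3f}  Cook's D = {d:.3f}")
```

In this made-up example a box plot of y alone would single out the last observation, yet the fifth observation has the largest residual and Cook's distance. Winsorising y on its own would change the wrong point.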
