  • Centile code to remove the outliers

    Dear all
    I used this code to remove the outliers, and it works well. I just want to know whether this code amounts to winsorizing the data.
    centile varname, centile(1 99)
    drop if varname < r(c_1) | varname > r(c_2)
    Thank you in advance for your comments.
    Issa

  • #2
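    It is not winsorizing: your code drops the observations in the tails, whereas winsorizing keeps them and replaces their values. You could do winsorizing with the winsor command, which you can download from SSC. To do so, type ssc install winsor.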
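
    For comparison with the centile-and-drop code in #1, here is a minimal sketch of percentile-based winsorizing done by hand. It assumes the auto dataset shipped with Stata and the variable price purely for illustration; price_w is a hypothetical name. Every observation is kept, and values beyond the 1st and 99th centiles are replaced by those centiles.

    ```stata
    sysuse auto, clear

    * centile leaves the 1st and 99th centiles in r(c_1) and r(c_2)
    centile price, centile(1 99)
    local lo = r(c_1)
    local hi = r(c_2)

    * Winsorizing: keep every observation, but pull the tails in
    generate price_w = price
    replace price_w = `lo' if price < `lo'
    replace price_w = `hi' if price > `hi' & !missing(price)

    * Trimming, as in #1, would instead discard the tails:
    * drop if price < `lo' | price > `hi'

    summarize price price_w
    ```

    The winsor command may define its cutoffs differently from this sketch, so check its help file before comparing results.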

    However, I am rather sceptical (this is an understatement) about such automagic solutions. Legitimate outliers contain very useful information and should not be discarded. Outliers that are the result of errors need to be identified directly and not based on such automagic rules.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      I agree 200% with Maarten. Winsorizing is replacing values in the tails with values from further inside the distribution: hence the three lowest values, ranked 1, 2, 3, could be Winsorized two deep by using the value of rank 3 in place of the two outermost values; and similarly in the other tail. Or Winsorize just one tail.

      With a sample size of 11, we might go:

      use ranks 3, 3, 3, 4, 5, 6, 7, 8, 9, 9, 9

      or

      use ranks 1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 9
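
      The 11-observation example above can be sketched in Stata as follows. This is only an illustration on toy data in which each value equals its rank; x and xw are hypothetical variable names.

      ```stata
      clear
      set obs 11

      * Toy data: the value of x equals its rank, 1 to 11
      generate x = _n
      sort x

      * Winsorize two deep in both tails:
      * the two lowest values take the value of rank 3,
      * the two highest take the value of rank 9 (that is, _N - 2)
      generate xw = x
      replace xw = x[3] in 1/2
      replace xw = x[_N - 2] if _n > _N - 2

      list x xw, noobs
      ```

      Omitting the first replace line gives the one-tailed version, in which only the upper tail is Winsorized.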

      Winsorizing might be a preliminary to (e.g.) taking an average, but Winsorized data I would not want to use in modelling. Throwing away the far tails is to me also a terrible method. But evidently it is in use.


      • #4
        Thank you for your comments. Could you then suggest the best way to deal with outliers?


        • #5
          There can't be a universal solution for outliers, assuming that is what you mean. Note that extremes need not be outliers and indeed (especially in two- or higher-dimensional spaces) outliers need not be extremes.
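
          As a rough illustration of an outlier that is not an extreme, the sketch below builds made-up data in which y equals x, then adds one hypothetical point that lies well inside the range of each variable but far from the pattern the two follow jointly. The variable names and the regression-residual check are assumptions for illustration only; a percentile rule applied to x or y separately would not flag this point.

          ```stata
          clear
          set obs 22

          * Observations 1 to 21 lie exactly on the line y = x, with x from -10 to 10
          generate x = _n - 11
          generate y = x

          * Observation 22: a hypothetical point inside both marginal ranges
          replace x = 5 in 22
          replace y = -5 in 22

          * Neither variable looks extreme on its own
          summarize x y

          * The joint pattern shows the point through its residual
          regress y x
          predict res, residuals
          generate absres = abs(res)
          gsort -absres
          list x y res in 1/3, noobs
          ```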

          Sometimes, indeed usually in my view, outliers are genuine and should be included in an analysis. Here is one discussion (the title of the thread is a little misleading):

          http://stats.stackexchange.com/quest...iers-with-mean

          There are several books on handling outliers, so the topic resists a one-line recommendation.


          • #6
            Thank you, Nick, for the link, which I found very helpful because it lets you read statisticians' opinions on how to deal with outliers.


            • #7
              I am not a statistician....


              • #8
                Issa, different people have different opinions about outliers.

                My approach was to keep all legitimate observations and to correct data-entry errors.

                Some people say to leave outliers in but to estimate the model both with and without them, so you can report both sets of results and see whether the outliers make any difference.

                Alternatively, you may create an impulse dummy to control for outliers.
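
                A minimal sketch of these last two options, assuming the auto dataset shipped with Stata and an arbitrary regression of price on weight. The flag variable is hypothetical: here it simply marks the most expensive car, which is not a recommendation for how to identify outliers.

                ```stata
                sysuse auto, clear

                * Hypothetical flag: treat the most expensive car as the suspect observation
                egen maxprice = max(price)
                generate byte flag = (price == maxprice)

                * Estimate with and without the flagged observation and compare
                regress price weight
                regress price weight if !flag

                * Impulse dummy: an indicator equal to 1 only for the flagged observation
                regress price weight flag
                ```

                When the dummy marks a single observation, the coefficient on weight in the last regression should match the one from the regression that omits that observation, so the two approaches answer much the same question.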


                • #9
                  Dear All,

                  I am sorry but I cannot resist adding a comment to this thread to reinforce (as if that was needed!) the views expressed by Maarten and Nick.

                  Issa, I noticed that you work in finance and I think that this is a field where it is particularly important to be careful with outliers. Of course we want to eliminate errors in the data, but we certainly need to keep in mind that extreme events happen; the recent (current?) financial turmoil is an example of that.

                  If you systematically trim, Winsorize, downweight, or mute your “outliers” with dummies, you will end up with a model that describes a world that is not real and where things like crashes never happen. That can have very serious consequences and at present we are all suffering because of that kind of optimism.

                  Outliers play the very useful role of reminding us that sometimes strange things happen.

                  All the best,

                  Joao


                  • #10
                    Thank you, Joao, for your comment.
