  • winsorize the data

    Dear Statalisters,

    I want to winsorize the data, that is, set the bottom and top 1% of the sample to r(p1) and r(p99) respectively.

    I ran

    sum x, d
    replace x = r(p1) if x < r(p1)
    replace x = r(p99) if x > r(p99)

    In Stata 13, the first replace works, but the second changes zero observations. I assume I can't use two replace commands consecutively.

    Please let me know if that is the case.

    Best,
    Rochelle

  • #2
    Did you check whether there were any cases that satisfied your second condition? That is, are there any values of x greater than the 99th percentile? If your largest value is equal to the 99th percentile there would be no change made to x.

    • #3
      Originally posted by Rochelle Zhang
      in stata13, the first replace works, but the second gives me zero observations,
      Just one possibility: how many observations have values above the 99th percentile? You can check with
      Code:
      count if x>r(p99)

      • #4
        Actually, -count if x > r(p99)- may not produce the intended result here, because it will also count observations for which x is missing: Stata treats a missing value as larger than any nonmissing number.

        Code:
        count if x > r(p99) & !missing(x)
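
        To illustrate why the extra condition matters, here is a minimal sketch on made-up data (the five values below are invented purely for illustration and are not from Rochelle's dataset):
        Code:
        clear
        set obs 5
        generate x = _n                 // x = 1, 2, 3, 4, 5
        replace x = . in 5              // make the last value missing
        count if x > 3                  // 2: the value 4 and the missing value
        count if x > 3 & !missing(x)    // 1: only the value 4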

        • #5
          Good point about the missing values, Clyde. Although I suspect Rochelle's data has no missing values, it is certainly good practice to add the extra condition in all cases. And this would also apply to Rochelle's second replace command.
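
          Putting the suggestions in this thread together, a minimal sketch might look like the following. The simulated variable x, the seed, and the local macro names p1 and p99 are assumptions made for illustration. Saving the percentiles in locals is optional: it only guards against some other r-class command being run between -summarize- and the -replace- statements and overwriting r().
          Code:
          clear
          set obs 1000
          set seed 12345
          generate x = rnormal()          // stand-in for the real variable

          summarize x, detail
          local p1  = r(p1)               // 1st percentile
          local p99 = r(p99)              // 99th percentile

          replace x = `p1'  if x < `p1'
          replace x = `p99' if x > `p99' & !missing(x)   // leave missing values alone

          With real data you may prefer to -generate- a copy of the variable first and winsorize the copy, so that the original values are kept.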

          • #6
            Many thanks to Sarah, Aspen, and Clyde! You were all correct. I did not realize that I have no observations with values greater than r(p99); I thought that using two replace statements was incorrect.


            Rochelle

            • #7
              Note that there are user-written programs for this; try -search winsorize-.

              • #8
                Use
                Code:
                ssc install winsor
                winsor Currentvariablename, gen(Newvariablename) p(0.01)
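
                As a rough sketch of how this could be tried out, the lines below run -winsor- on Stata's bundled auto dataset. The dataset, the variable price, the new name price_w, and the 5% fraction are choices made here for illustration only. Note that p() takes a fraction rather than a percentage, and that -winsor- works from order statistics, so its results need not match the r(p1)/r(p99) approach exactly.
                Code:
                capture which winsor
                if _rc ssc install winsor       // install only if not already present

                sysuse auto, clear
                winsor price, gen(price_w) p(0.05)
                summarize price price_w         // compare the ranges before and after
                count if price != price_w       // number of values that were changed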

                • #9
                  I have a panel dataset, which I have declared as panel data using the xtset command. Now I would like to winsorize one key variable in the dataset. What is the best way to do this? Can someone suggest the commands needed? Do I have to download an additional module to do that? Thanks.
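
                  One possible sketch, using only official Stata commands, winsorizes within each time period. The variable names x and year are hypothetical, and whether to winsorize over the pooled sample or separately by period is a judgment call that depends on the application. For the pooled case, the -summarize, detail- plus two -replace- statements from earlier in this thread would do, and -xtset- does not affect either approach.
                  Code:
                  * x (variable to winsorize) and year (time variable) are hypothetical names
                  bysort year: egen double x_p1  = pctile(x), p(1)
                  bysort year: egen double x_p99 = pctile(x), p(99)

                  generate double x_w = x                            // keep the original intact
                  replace x_w = x_p1  if x < x_p1
                  replace x_w = x_p99 if x > x_p99 & !missing(x)     // leave missing values alone

                  drop x_p1 x_p99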
