  • Winsorizing of subsets

    Dear Forum,

    I would like to winsorize my data in order to control/minimize the influence of outliers. My dataset consists of ROA figures for several companies within several industries. The ROA is stored in one variable for all companies, and the industry is specified in an industry dummy variable. Because ROA differs vastly between industries, I don't want to winsorize the whole ROA variable at once. Instead I want to winsorize within each industry, using e.g. the 5th and 95th percentiles of Manufacturing for the ROAs of companies within Manufacturing. I installed the winsor package and used the code

    winsor ROA, p(.05) gen(ROA) if Manufacturing==1

    However, I get the error message r(198) "option if() not allowed". How can I apply different winsorization bounds to subsets of a variable?

    Many thanks in advance!

  • #2
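    Simply move the -if- condition in front of the comma. In Stata syntax, -if- is a qualifier that comes before the comma, while options come after it.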

    Code:
    * gen() needs a new variable name, since ROA already exists
    winsor ROA if Manufacturing==1, p(.05) gen(ROA_w)


    • #3
      Many thanks, Fei! I did so and it worked.
      However, I now face the issue that I cannot collect all the winsorized values in a single variable; I have to generate one new winsorized variable for each industry. Currently I create one variable into which I merge everything and then delete the per-industry winsorized variables. Is there a cleaner way?


      • #4
        winsor2 is a similar command that has a by(varlist) option.
        ssc install winsor2
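
        A minimal sketch of how that could look, assuming the industries are coded in a single categorical variable. The names industry and ROA_w and the simulated data are made up for illustration and are not from the original post; winsor2 must be installed first.

        ```stata
        * Simulated example data: 3 hypothetical industries, 100 firms each
        clear
        set seed 12345
        set obs 300
        gen industry = ceil(_n/100)
        gen double ROA = rnormal(0.05*industry, 0.10)

        * Winsorize ROA at the 5th and 95th percentiles within each industry.
        * suffix(_w) stores the result in a new variable ROA_w and leaves ROA untouched.
        winsor2 ROA, cuts(5 95) by(industry) suffix(_w)
        ```

        With a set of 0/1 dummies instead of one categorical variable, the dummies could be passed to by() together, since each combination of values then identifies one industry.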


        • #5
          Lara, in addition to the solution in #4, if you'd still like to use -winsor-, you can loop over the industries as below.

          Code:
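          * Store the distinct values of Manufacturing in matrix M
          tab Manufacturing, matrow(M)
          local maxr = rowsof(M)

          * One variable collects the winsorized values of all groups
          gen ROA_win = .
          forvalues r = 1/`maxr' {
              * Winsorize within the current group only, then copy the result over
              winsor ROA if Manufacturing == M[`r',1], p(.05) gen(temp)
              replace ROA_win = temp if Manufacturing == M[`r',1]
              drop temp
          }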
          Last edited by Fei Wang; 08 Nov 2021, 11:31.
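
          For reference, the same per-industry winsorizing can also be sketched with built-in commands only, which makes explicit what happens within each group. This is an illustration under assumptions, not the method used by -winsor- or -winsor2-: the variable names industry, p05, p95 and ROA_w and the simulated data are made up, and the percentiles come from -egen, pctile()-, which may differ slightly at the cutoffs from what the packages compute.

          ```stata
          * Simulated example data: 3 hypothetical industries, 100 firms each
          clear
          set seed 12345
          set obs 300
          gen industry = ceil(_n/100)
          gen double ROA = rnormal(0.05*industry, 0.10)

          * 5th and 95th percentiles of ROA, computed separately within each industry
          bysort industry: egen double p05 = pctile(ROA), p(5)
          bysort industry: egen double p95 = pctile(ROA), p(95)

          * Pull values outside [p05, p95] back to the nearest bound.
          * The if qualifier keeps missing ROA missing, because min() and max() ignore missings.
          gen double ROA_w = min(max(ROA, p05), p95) if !missing(ROA)
          ```

          Because the bounds are stored per observation, all industries end up in the single variable ROA_w without any loop or temporary variable.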
