Hi everyone,
This may sound like a basic question, but it has been on my mind lately: at which percentiles should I winsorize my variable? And should I do this before or after taking the natural log?
My aim is to keep extreme values from driving my regression results.
For instance, I have deflated assets for all firms in Compustat (I apply some filters, but none based on financial variables), and the values differ enormously across firms. I am not convinced that winsorizing at just the 1st and 99th percentiles is enough: the values are so dispersed that even the 95th or 90th percentile value may be extreme enough to affect my regression.
I also wonder whether it is better to choose the cutoffs before or after taking the log. Once I take the log, everything looks smaller, although a one-unit difference matters much more. I suppose there is literally no difference, but in practice, which approach do you think is more helpful?
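To make the "no difference literally" point concrete, here is a minimal Python sketch. It assumes strictly positive values and percentile cutoffs taken as actual observed values (no interpolation between neighbors). The data are synthetic lognormal draws, not the Compustat sample, and the helper name `winsorize` and the 1%/99% defaults are chosen only for illustration. Because the log is strictly increasing, it preserves the ranking of observations, so clipping at a given percentile and then taking the log should give the same result as taking the log and then clipping at that percentile.

```python
import numpy as np

def winsorize(x, lower=0.01, upper=0.99):
    """Clip x at its own empirical lower and upper quantiles."""
    # "lower"/"higher" return observed data points rather than
    # interpolating between two neighboring observations.
    lo = np.quantile(x, lower, method="lower")
    hi = np.quantile(x, upper, method="higher")
    return np.clip(x, lo, hi)

# Synthetic, strictly positive, heavily right-skewed "assets" (illustrative only).
rng = np.random.default_rng(0)
assets = rng.lognormal(mean=5.0, sigma=2.0, size=10_000)

log_of_winsorized = np.log(winsorize(assets))   # winsorize first, then log
winsorized_log = winsorize(np.log(assets))      # log first, then winsorize

same = np.allclose(log_of_winsorized, winsorized_log)
print(same)
```

With interpolated quantiles (the default in many packages) the two orders can differ slightly at the cutoffs, and the equivalence only holds when the same percentiles are used in both cases. What does change after taking the log is how extreme the tails look, which may affect which percentiles seem reasonable.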
I attach a distribution of my asset variable below.


Thank you so much!