Hello everybody,
Regarding winsorizing, I have a little trouble understanding a procedure used in the literature. I already know and understand the winsor command (SSC), e.g. to winsorize variables at their 1st and 99th percentiles. However, I have to estimate regressions on industry-year cross-sections, and the procedure calls for "winsorizing the regression variables at three standardization deviations each year". First, I'm confused about how to winsorize at standardization deviations. Does it maybe mean that I have to standardize the variables within each industry-year and then winsorize at 3 percentiles? I know that's more a question of statistical understanding, but as a consequence I also don't know how to implement it in Stata. Furthermore, I don't understand how to winsorize each industry-year separately in Stata. I think I have to use loops and tried the code shown below.
However, that code doesn't work, since winsor does not allow in or if qualifiers, and it ignores the standardization-deviation problem (sorry if the code is complete nonsense, it's one of my first attempts at loops in Stata). Maybe someone has already dealt with such a procedure and can help me?
Code:
egen industry_year=group(industry year)
su industry_year
scalar a=r(min)
scalar b=r(max)
foreach var of varlist x1 x2 x3 x4 {
    forvalues c=`=scalar(a)'/`=scalar(b)' {
        winsor `var', p(0.03) gen(w_`var') in `c'
        replace `var'=w_`var' in `c'
        drop w_`var' in `c'
    }
}

Thank you!
TM
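One possible reading of the quoted procedure, offered as an assumption rather than as the method of the paper in question: "three standardization deviations" means three standard deviations, so values beyond the group mean plus or minus 3 SD are pulled back to those bounds. Under that reading neither winsor nor an explicit loop over groups is needed, because egen can compute group-specific means and standard deviations directly. The sketch below uses the variable names from the post (year, industry, x1 to x4); the toy data at the top, the local macro grp and the temporary variables m_* and s_* are made up for illustration.

```stata
* --- toy data for illustration only; skip this part with a real dataset ---
clear
set seed 12345
set obs 200
generate year     = 2000 + mod(_n, 4)
generate industry = 1 + mod(_n, 5)
foreach v in x1 x2 x3 x4 {
    generate double `v' = rnormal()
}
replace x1 = 50 in 1            // artificial outlier

* --- winsorize at mean +/- 3 SD within each group ---
* "each year" as in the quote; use "industry year" for industry-year cells
local grp year

foreach var of varlist x1 x2 x3 x4 {
    * group-specific mean and standard deviation
    bysort `grp': egen double m_`var' = mean(`var')
    bysort `grp': egen double s_`var' = sd(`var')

    * clip to [mean - 3*sd, mean + 3*sd]; missing values stay missing
    generate double w_`var' = ///
        min(max(`var', m_`var' - 3*s_`var'), m_`var' + 3*s_`var') ///
        if !missing(`var')

    drop m_`var' s_`var'
}
```

The winsorized copies are stored as w_x1 to w_x4, so the originals stay available for comparison. Note that the mean and SD are themselves computed with the outliers included, so this is only one of several ways the quoted sentence could be implemented. If the intended procedure is percentile-based winsorizing within groups instead, the community-contributed winsor2 command from SSC has, as far as I know, a by() option that may be worth checking.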