Hi
I am trying to create deciles of a variable, l_op2, which is negative for all observations. I use the following code to create deciles that are recomputed each year:
egen decile = xtile(l_op2) , by(yr) p(1/9)
I do get 10 groups each year. The problem is that deciles 1 to 9 contain fairly similar numbers of observations, while decile 10 contains far more than any of the others. For example, for 2011 I get:
. sum l_op2 if decile==1 & yr==2011
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       l_op2 |         8   -.3147382    .0837412  -.4147727  -.2305505
. sum l_op2 if decile==10 & yr==2011
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       l_op2 |       698   -.0048349    .0064964  -.0293935  -7.32e-07
The pattern is much the same in almost every year. Pooled across all years, I get:
. sum l_op2 if decile==1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       l_op2 |       135   -.2464536    .1394698  -.6817232  -.0689984
. sum l_op2 if decile==10
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       l_op2 |     10361   -.0060971    .0090912  -.0675198  -6.19e-08
So decile 10 has far more observations than any other decile.
How can I fix this problem?
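One possible cause, sketched here under the assumption that xtile() is the egen function from the egenmore package (SSC): the option p(1/9) expands to the numlist 1 2 ... 9, so the cut points would be the 1st to 9th percentiles rather than the 10th to 90th. That would leave roughly 1% of each year's observations in each of groups 1 to 9 and the rest in group 10, which looks consistent with the counts above. The variable names decile_p and decile_nq below are made up for illustration.

```stata
* Assumption: xtile() is the egen function provided by egenmore
* (install with: ssc install egenmore)

* Cut at the 10th, 20th, ..., 90th percentiles within each year
egen decile_p = xtile(l_op2), by(yr) p(10(10)90)

* Alternative request: ten quantile groups within each year
egen decile_nq = xtile(l_op2), by(yr) nq(10)

* Check how many observations fall in each group, year by year
tabulate yr decile_p
```

If the two new variables disagree, or the group sizes are still very uneven, ties in l_op2 or missing values would be the next thing to check.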