
  • Panel data, rank and percentiles

    Dear Statalist,
    I have an unbalanced panel of funds from 2000 to 2014, with monthly observations of each fund's beta; not every fund is observed for the whole period.

    fund (id)   month     beta
    1           2000m10   0.987
    1           2000m11   0.654
    1           2000m12   1.112
    1           2001m1    1.022
    1           2000m2    0.944
    2           2001m1    0.888
    2           2001m2    0.921
    2           2001m3    0.765
    2           2001m4    0.876
    2           2001m5    0.645
    2           2001m6    1.213
    3           2005m1    1.001
    3           2005m2    0.732

    I also have a dummy variable indicating whether the fund is managed by a man or a woman in the month for which the beta is provided. I want to rank the funds within each month (2000m1, 2000m2, 2000m3, ...) over the entire period 2000-2014. In the end I want to calculate the share of women in different percentile ranges of the beta distribution (top 1%, top 10-40%, middle 20%, bottom 10-40%, bottom 1%). I have two questions. First, I am not sure whether the statistics will be correct, since the number of observations differs from month to month. Second, I am not sure which command to use, -pctile- or -xtile-, and how to obtain the share of women per month in each rank group.

    I have used the following command:

    egen decile = xtile( Beta), by(month) p(10(10)90)

    but it generates only the numbers 1 to 9, and I cannot "translate" these into top 1%, top 10-40%, middle 20%, bottom 10-40%, bottom 1%.

    Any help is appreciated.




  • #2
    It seems to me that you have two options:
    Code:
    egen decile1=xtile(Beta), by(month) nq(100)
    and then collapse, or recode, into the categories you want, or
    Code:
    egen decile2=xtile(Beta), by(month) p(1 10 40 60 90 99)
    But note that (1) I'm not sure I understand your groupings and may have this wrong, and (2) your groupings are not exhaustive, whereas this second method will give every observation a value.
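As a hedged illustration of the recoding step, the sketch below turns within-month percentile ranks into seven exhaustive bands using the cut points 1 10 40 60 90 99. It is not the poster's data or code: the data are simulated, the variable names (`month`, `beta`, `female`, `pr`, `band`) and the band labels are assumptions, and it uses only built-in `egen` functions and `irecode()` instead of `egen ... xtile()`, which comes from the user-written egenmore package.

```stata
clear
set seed 2014
set obs 6000
gen month  = ceil(_n / 500)          // 12 hypothetical months, 500 funds each
gen beta   = rnormal(1, 0.2)         // simulated betas, not real data
gen female = runiform() < 0.3        // simulated manager dummy

* percentile rank of beta within each month
egen r = rank(beta), by(month)
egen n = count(beta), by(month)
gen pr = 100 * (r - 0.5) / n

* irecode() returns 0 for pr <= 1, 1 for 1 < pr <= 10, ..., 6 for pr > 99
gen band = 1 + irecode(pr, 1, 10, 40, 60, 90, 99)
label define band 1 "bottom 1%" 2 "bottom 1-10%" 3 "bottom 10-40%" ///
    4 "middle 20%" 5 "top 10-40%" 6 "top 1-10%" 7 "top 1%"
label values band band
tabulate band
```

Band 1 holds the smallest betas and band 7 the largest, so the "top" groups carry the highest codes.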



    • #3
      Rich, thanks for the immediate reply. Let me try to explain again.
      I want to rank the funds' betas within each month. Then I have to estimate the percentage of women in the top 1%, top 10-40%, middle 20%, bottom 10-40% and bottom 1%. Finally I want to make a graph of the share of women in each category over the whole period (should I use -mean- or -collapse- here?).
      In short, I need a graph with the percentage of women in each category. Is this the right way to do it?



      • #4
        Hello everyone,

        I am sorry for repeating my question. Let me try to explain again with more details and data.
        I have an unbalanced panel of funds from 2000 to 2014, with monthly observations of each fund's beta; not every fund is observed for the whole period.

        fund (id)   month     beta    deciles
        1           2000m10   0.987   1
        1           2000m11   0.654   2
        1           2000m12   1.112   2
        1           2001m1    1.022   3
        1           2000m2    0.944   2
        2           2001m1    0.888   1
        2           2001m2    0.921   2
        2           2001m3    0.765   7
        2           2001m4    0.876   10
        2           2001m5    0.645   14
        2           2001m6    1.213   11
        3           2005m1    1.001   5
        3           2005m2    0.732   2


        I also have a dummy variable indicating whether the fund is managed by a man or a woman in the month for which the beta is provided. I want to rank the funds within each month (2000m1, 2000m2, 2000m3, ...) over the entire period 2000-2014. In the end I want to calculate the share of women in different percentile ranges of the beta distribution (top 1%, top 10-40%, middle 20%, bottom 10-40%, bottom 1%). I have two questions. First, I am not sure whether the statistics will be correct, since the number of observations differs from month to month. Second, I am not sure which command to use, -pctile- or -xtile-, and how to obtain the share of women per month in each rank group.

        I want to replicate a paper which states that if women follow an extreme strategy (with respect to beta), then the betas of women should lie in the tails of the distribution. The authors compute the share of women in different percentiles of the beta distribution.

        I have used the following command:

        egen deciles=xtile( Beta), by (month) nq(20)
        egen all_funds = count( wficn), by (month deciles)

        egen male_funds = count( wficn) if females==0, by (month deciles)

        egen female_funds = count( wficn) if females==1, by(month deciles)
        gen p_male_funds = male_funds/all_funds if all_funds > 0
        replace p_male_funds = 0 if male_funds == 0 & all_funds > 0
        gen p_female_funds = female_funds/all_funds if all_funds > 0
        replace p_female_funds = 0 if female_funds == 0 & all_funds > 0

        Here is a quotation from the paper that I am trying to replicate. The authors do this for single- and team-managed funds; I have to apply the same idea to women and men in my work.

        "To get a first idea about the extremity of a fund’s investment style we analyse
        the distribution of the factor loadings β1 to β4 from model (1) for team- and single-managed
        funds. If a fund follows an extreme strategy with respect to a specific style
        dimension, its factor loadings are more likely to be in the tail of the distribution of
        all fund’s factor loadings in the same year. Thus, if the diversification of opinions
        Hypothesis 1 holds, we should observe a larger fraction of single-managed funds in
        the most extreme percentiles of the distribution of factor loadings. These shares are calculated as the average of the
        respective yearly shares over our sample period. This ensures that our results are not
        driven by shifting style preferences within the mutual fund industry in combination
        with the increased share of team-managed funds."

        Their results show, for example, that single managers account for 73% of the top 10% of betas and 70% of the top 10-20% of betas.

        Something seems to be wrong in my code. My results show that men account for only 48% of the top 10% of betas, which cannot be right: men follow more extreme strategies, so they should have more extreme betas than women at both the top and the bottom of the distribution.

        I am not sure how the authors do this (cumulatively or not).

        Could someone tell me whether this is the right way to proceed? I want to rank the funds' betas within each month. Then I have to estimate the percentage of women in the top 1%, top 10-40%, middle 20%, bottom 10-40% and bottom 1%. Finally I want to make a graph of the share of women in each category for the whole period of 13 years. In short, I need a graph with the percentage of women in each category.

        Could someone help me, please?



        • #5
          Anybody? I really need help, please.



          • #6
            The comments here, http://www.statalist.org/forums/foru...nd-percentiles, imply, I think, that you may need to rewrite this radically to get a response.



            • #7
              OK, let me put it differently. An example of my data is shown above. I have monthly observations of the funds' betas, but not all funds have a beta in every month: in some months there are 300 betas, in others 500. I also have a dummy indicating whether a fund is single- or team-managed. I want to rank the betas within each month and then estimate the percentage of single-managed funds in each rank group (top 1%, top 10-40%, middle 20%, bottom 10-40%, bottom 1%) for each month. After that I want to estimate the average percentage of single-managed funds in each rank group over the whole period. Here is a quotation from a paper I have read in which the authors do something similar, but with yearly observations.

              "To get a first idea about the extremity of a fund’s investment style we analyse
              the distribution of the factor loadings β1 to β4 from model (1) for team- and single-managed
              funds. If a fund follows an extreme strategy with respect to a specific style
              dimension, its factor loadings are more likely to be in the tail of the distribution of
              all fund’s factor loadings in the same year. Thus, if the diversification of opinions
              Hypothesis 1 holds, we should observe a larger fraction of single-managed funds in
              the most extreme percentiles of the distribution of factor loadings. These shares are calculated as the average of the
              respective yearly shares over our sample period. This ensures that our results are not
              driven by shifting style preferences within the mutual fund industry in combination
              with the increased share of team-managed funds."

              Here is my code:

              egen deciles=xtile( Beta), by (month) nq(10)
              egen all_funds = count( wficn), by (month deciles)

              egen singlef = count( wficn) if team==0, by (month deciles)

              egen teamf = count( wficn) if team==1, by(month deciles)
              gen p_single_funds = singlef/all_funds if all_funds > 0
              replace p_single_funds = 0 if singlef == 0 & all_funds > 0
              gen p_team_funds = teamf/all_funds if all_funds > 0
              replace p_team_funds = 0 if teamf == 0 & all_funds > 0

              The problem is that when I use -collapse- (collapse p_single_funds, by (deciles)) I obtain percentages that are very different from the expected results. For example, I obtain 45% single-managed funds in the top 10% of betas, 48% in the top 30%, 53% in the bottom 30% and 55% in the bottom 10%. The percentage of single-managed funds should actually be U-shaped (that is, there should be more single managers in both tails of the beta distribution): the highest percentages of single-managed funds should be in the top 10% and the bottom 10% of betas.

              Is something wrong in my code, or in the way I am trying to do this compared with the paper above? I am fairly sure what the results should look like.
              I would also like to know whether there is a command in Stata that can directly show the top 1%, top 10%, bottom 20%, bottom 10% and bottom 1% of a variable.

              Thanks in advance.
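One possible reading of the quoted procedure (monthly shares first, then an unweighted average of those shares) can be sketched as follows. This is an assumption-laden illustration, not the paper's method or a diagnosis of the code above: the data are simulated, the names `single`, `decile` and `share_single` are hypothetical, and deciles are built from built-in `egen rank()` and `count()` rather than from `egen ... xtile()`.

```stata
clear
set seed 42
set obs 5000
gen month  = ceil(12 * runiform())   // unequal numbers of funds per month
gen beta   = rnormal(1, 0.25)        // simulated betas, not real data
gen single = runiform() < 0.6        // 1 = single-managed (simulated)

* decile of beta within each month: 1 = lowest betas, 10 = highest betas
egen r = rank(beta), by(month)
egen n = count(beta), by(month)
gen decile = ceil(10 * r / n)

* step 1: share of single-managed funds in each month-decile cell
collapse (mean) share_single = single, by(month decile)

* step 2: unweighted average of the monthly shares, one row per decile
collapse (mean) share_single, by(decile)
list, clean
```

Taking the mean of a 0/1 dummy within each cell gives the share directly, and collapsing in two steps gives every month the same weight regardless of how many funds it contains. Note that group 10, not group 1, holds the largest betas under this numbering, which is worth checking before reading a group as the "top 10%".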



              • #8
                Anybody?



                • #9
                  Could someone help me?



                  • #10
                    Hello, I have the following data, which I have already grouped into quantiles:
                    Code:
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input byte nq_incftx long totinc
                    5 44257
                    1  7067
                    1  6124
                    2 11323
                    5 53423
                    5 34801
                    2 10864
                    5 37855
                    5 52098
                    3 22048
                    3 26063
                    4 30409
                    2 11153
                    3 19812
                    1  5082
                    3 21691
                    1  6500
                    3 24261
                    4 28515
                    4 29481
                    3 17840
                    1  6212
                    1  7486
                    1  7895
                    5 42572
                    1  6420
                    3 21365
                    4 30075
                    1  5358
                    3 19847
                    3 22379
                    5 52834
                    4 26701
                    3 17445
                    1  7604
                    2 12507
                    1  6623
                    3 23164
                    4 26270
                    3 24024
                    4 26117
                    4 31047
                    4 32187
                    5 43226
                    2 15190
                    5 43257
                    4 32615
                    1  7092
                    3 23515
                    2 10471
                    4 26898
                    5 43160
                    2 10276
                    5 40343
                    4 28562
                    1  5400
                    4 28710
                    2 11704
                    5 35480
                    3 28512
                    1  8296
                    4 31798
                    4 31912
                    5 38767
                    1  5324
                    1  5970
                    5 45254
                    4 29520
                    4 31959
                    1  5091
                    3 21586
                    5 36165
                    5 44206
                    5 43224
                    2 10509
                    2 13508
                    4 31411
                    3 20315
                    1  4570
                    3 18264
                    5 38677
                    2  9334
                    5 37884
                    5 58665
                    1  6654
                    4 27402
                    4 30314
                    4 29864
                    2 13690
                    4 30850
                    1  5091
                    1  7429
                    4 27285
                    1  5091
                    4 33758
                    2  9427
                    3 23722
                    3 25700
                    4 26776
                    1  5552
                    end
                    I was wondering if anyone could help me calculate the growth rate of income using a percentile ratio.
                    Thanks in advance.
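The posted data contain no time variable, so a growth rate cannot be computed from them alone. The sketch below only illustrates one possible interpretation: compute the P90/P10 ratio of income in each of two periods and then the relative change in that ratio. The `year` variable, the choice of the 90th and 10th percentiles, and the simulated incomes are all assumptions made for illustration.

```stata
clear
set seed 12345
set obs 200
gen year   = cond(_n <= 100, 1, 2)   // hypothetical period identifier
gen totinc = exp(rnormal(10, 0.8))   // simulated incomes, not the posted data

* P90/P10 ratio within each period; _pctile stores results in r(r1), r(r2)
forvalues y = 1/2 {
    _pctile totinc if year == `y', percentiles(10 90)
    scalar ratio`y' = r(r2) / r(r1)
}

* relative change in the ratio between the two periods
scalar growth = (ratio2 - ratio1) / ratio1
display "P90/P10, period 1:   " ratio1
display "P90/P10, period 2:   " ratio2
display "growth of the ratio: " growth
```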

