Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating decile portfolio's

    Dear Statalist,

    I would like to ask for help regarding a question I've been pondering. I am replicating portfolio's from the "Quality minus Junk" paper (by Asness & Frazzini: http://www.econ.yale.edu/~shiller/be...i-pedersen.pdf).

    In this paper they basically create 10 quality sorted portfolios containing stocks that each receive a quality score based on certain criteria that they must meet. I have already managed to give each stock (in each month) a certain quality score. This Quality score is made up of the 3 subcomponents of a stock (Profitability, Growth, Safety).
    I have a monthly dataset of U.S. only stocks from jan 1962- dec 2019, so basically 696 months in total.
    This dataset contains around 16000 stocks that have existed through this time period in the NYSE, Nasdaq, AMEX stock exchanges (each stock exchange has its own code in the data)
    I have around 2.5 million observations.

    I basically need to assign all these stocks to 10 different Quality-sorted portfolios in each calendar month. Furthermore, these portfolios must be value-weighted, refreshed every month and rebalanced every month to maintain the value weights. The "value" depends on the variable ME (Market Equity), basically the sum of all the MEs of all stocks in a given month is the market value of the whole stock market in that month.

    an example of my data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long datadate str8 tic str10 cusip float ME double exchg float(return zProfit zGrowth zSafety Quality)
     7182 "SGI.2"  "826588105"  15.79375 12 -.022305757    -.904848 -1.5013336   .3825867  -2.0235949
     7212 "SGI.2"  "826588105"  15.08125 12  -.04616204   -.8944374  -1.499703  .35482535  -2.0393155
     7243 "SGI.2"  "826588105"  10.09375 12   -.4015358    -.393708  -.7475409   .8147201  -.32652885
     7273 "SGI.2"  "826588105"      11.4 12   .12169693   -.5363941  -.8091926    .751304  -.59428275
     7304 "SGI.2"  "826588105"   13.7895 12     .189242   -.6279179  -.6645935   .6595777   -.6329337
     7335 "SGI.2"  "826588105"   14.7405 12  .066691376   -.4582379 -1.0410988   .8195408   -.6797959
     7364 "SGI.2"  "826588105"    12.363 12  -.17589067  -.38978845 -1.0613819   .3123595  -1.1388109
     7395 "SGI.2"  "826588105"  12.12525 12 -.019418085   -.3837294 -1.0644997   .7490196   -.6992095
     7882 "SGMA.1" "826613101"   33.8325 14           .   .13978979  -.3400794  1.1989115    .9986219
     7913 "SGMA.1" "826613101"    27.066 14  -.22314355   .57329845  -.3380209  1.1201755    1.355453
     7943 "SGMA.1" "826613101"    24.984 14   -.0800427    .5825597    -.32047  1.0610052    1.323095
     7974 "SGMA.1" "826613101"    28.107 14   .11778303   1.1294665  -.2901447  .24416465   1.0834866
     8004 "SGMA.1" "826613101"    26.025 14  -.07696104   1.0852712  1.6685475   .7322693    3.486088
     8035 "SGMA.1" "826613101"    23.943 14   -.0833816   1.1214297  1.4806896   .7625755    3.364695
     8066 "SGMA.1" "826613101"   24.4635 14  .021506205   1.4396784   1.527881   .3559664    3.323526
     8094 "SGMA.1" "826613101"   22.3815 14  -.08894748   1.4144346  1.4220906   .7389647     3.57549
     8125 "SGMA.1" "826613101"   20.2995 14  -.09763847    1.419679 -.50234103   .9626892    1.880027
     8155 "SGMA.1" "826613101"   22.3385 14   .09763847     1.43505 -1.5833572  .10317506  -.04513226
     8186 "SGMA.1" "826613101"   22.3385 14           0   1.5204192 -1.3395644    .752166    .9330208
     8216 "SGMA.1" "826613101"    19.741 14  -.12361395   1.5233015 -1.2396612   .4900049    .7736452
     8247 "SGMA.1" "826613101"  21.30975 14    .0759859    1.428565   1.366932  .10509898    2.900596
     8278 "SGMA.1" "826613101"  27.54675 14   .25671986   1.1896391   1.241601    .445317    2.876557
     8308 "SGMA.1" "826613101"  27.54675 14           0   1.1900645  1.2403766  .29000685    2.720448
     8339 "SGMA.1" "826613101"     30.68 14   .10724553    .8129184  -.6114678 -.28597745   -.0845268
     8369 "SGMA.1" "826613101"      31.2 14  .016807118    .9251416  -.5184898  .04121526     .447867
     8400 "SGMA.1" "826613101"     29.12 14  -.06899287    .8996775  -.5703049  -.0651526      .26422
     8431 "SGMA.1" "826613101"    31.245 14   .06899287    .4831056  -1.302887  -.7521274   -1.571909
     8459 "SGMA.1" "826613101"   30.2035 14  -.03390155   .48356766 -1.3040712 -.25290722  -1.0734107
     8490 "SGMA.1" "826613101"  35.93175 14    .1736635   .48085925 -1.3136277 -.54764754   -1.380416
     8520 "SGMA.1" "826613101"     39.71 14   .09662683   1.4153862 -1.0415485 -1.0023841   -.6285464
     8551 "SGMA.1" "826613101"    36.575 14   -.0822381    .7521121 -1.4705144  -.6423733  -1.3607756
     8581 "SGMA.1" "826613101"   38.1425 14    .0419642    .7477313  -1.482956 -.29466724  -1.0298922
     8612 "SGMA.1" "826613101"    31.395 14   -.1961149   1.0972514 -.41711265  -1.198525   -.5183862
     8643 "SGMA.1" "826613101"   32.4415 14  .032789823    .9817664  -.4009158  -.6667878   -.0859372
     8673 "SGMA.1" "826613101"  36.10425 14   .10697212    .9682081  -.4123471  -.6204031 -.064542115
     8704 "SGMA.1" "826613101"    36.942 14  .021506205   -.3320117   .7052925 -1.4250026  -1.0517218
     8734 "SGMA.1" "826613101"    38.252 14   .03484673    1.002622   .7934234 -1.0360687    .7599767
     8765 "SGMA.1" "826613101"    38.776 14  .013605652   1.0152515   .7856306  -.9618622    .8390199
     8796 "SGMA.1" "826613101"    38.961 14           0   -.6691055   .8953542  -1.381117  -1.1548682
     8825 "SGMA.1" "826613101"    35.802 14   -.0845574    .7957094  1.0030637 -1.0317022    .7670709
     8856 "SGMA.1" "826613101"    34.749 14 -.029852964    .7969189   .9996265  -.9455084     .851037
     8886 "SGMA.1" "826613101"    33.696 14  -.03077166    .6309318   .7468241 -1.4162977  -.03854179
     8917 "SGMA.1" "826613101"   30.0105 14  -.11583181    .6660623   .7563269  -.9988963   .42349285
     8947 "SGMA.1" "826613101"   31.0635 14  .034486175    .6768184   .7550929  -.9895332     .442378
     8978 "SGMA.1" "826613101"  32.66625 14   -.6435502    .4748054  -.7253925 -1.3462925  -1.5968795
     9009 "SGMA.1" "826613101"   31.6125 14 -.032789823    .5912699  -.7067988   -.273121   -.3886499
     9039 "SGMA.1" "826613101"   31.6125 14           0    .6011924   -.706153  -.2065828   -.3115435
     9070 "SGMA.1" "826613101"    29.344 14  -.06899287    .6761398  -.8200309   -.780265   -.9241561
     9100 "SGMA.1" "826613101"    29.344 14           0     .710171 -1.0089912  -.2301549   -.5289752
     9131 "SGMA.1" "826613101"    28.296 14 -.036367644    .7604399  -.9945419  -.2510132   -.4851152
     9162 "SGMA.1" "826613101"  32.52675 14   .13815033   -.2144503  -1.384211  -.8705583  -2.4692194
     9190 "SGMA.1" "826613101"  28.32975 14  -.13815033  .069460206  -1.444348  -.3765238  -1.7514117
     9221 "SGMA.1" "826613101"  28.32975 14           0  .069752716 -1.4390764 -.41325095  -1.7825747
     9251 "SGMA.1" "826613101"     30.45 14  .071458966   .16047476 -1.0005629  -.9826099   -1.822698
     9282 "SGMA.1" "826613101"      29.4 14  -.03509132    .6375924   -.813287 -.08602772   -.2617223
     9312 "SGMA.1" "826613101"      25.2 14   -.1541507    .6343504  -.8122494  -.3314739  -.50937295
     9343 "SGMA.1" "826613101"    28.647 14   .11778303    .4637481 -.55787504 -1.0230598  -1.1171868
     9374 "SGMA.1" "826613101"   27.0555 14  -.05715841    .6922631  -.4956926 -.56152624   -.3649557
     9404 "SGMA.1" "826613101"    28.647 14   .05715841    .6948391 -.51114607  .05031871   .23401177
     9435 "SGMA.1" "826613101"    35.013 14    .2006707    .4075425  .16796695  -.7709894  -.19547993
     9465 "SGMA.1" "826613101"    36.074 14  .029852964    .8482217  .29036307 -.17123054    .9673542
     9496 "SGMA.1" "826613101"     42.44 14   .16251893    .8611218    .308336 -.09041255   1.0790453
     9527 "SGMA.1" "826613101"   36.0995 14  -.16251893    .3641002  1.0480732  -.6903567    .7218167
     9555 "SGMA.1" "826613101"    38.223 14   .05715841    .5631984  1.0832932  .02595587   1.6724474
     9586 "SGMA.1" "826613101"   31.8525 14  -.18232156    .5644471  1.0811478  .09456871   1.7401636
     9616 "SGMA.1" "826613101"   31.8525 14           0   -.4716276   .7161497  -.5612514   -.3167293
     9647 "SGMA.1" "826613101"  35.03775 14   .09531018    .9122427  1.0112963  .16733856   2.0908775
     9677 "SGMA.1" "826613101" 30.259874 14  -.14660348    .9068546  1.0307956  .10385885   2.0415092
     9708 "SGMA.1" "826613101"  22.29675 14   -.3053817    .6316846   .6509266  -.9019834    .3806277
     9739 "SGMA.1" "826613101"   27.6055 14    .2135741    .6168441   .6385389  -.4733275    .7820555
     9769 "SGMA.1" "826613101" 23.889376 14  -.14458123    .6226215   .6468626 -1.2124445    .0570395
     9800 "SGMA.1" "826613101"   23.3585 14 -.022472857    .7096598   .6962613 -1.5296923   -.1237712
     9830 "SGMA.1" "826613101"    21.235 14  -.09531018    .9126512   .7014312 -1.1456245    .4684579
     9861 "SGMA.1" "826613101"  22.29675 14   .04879016    .9429122   .7208858 -1.0162798    .6475182
     9892 "SGMA.1" "826613101" 23.889376 14   .06899287   -.3611825  -.6346171  -1.590864   -2.586664
     9920 "SGMA.1" "826613101"  22.29675 14  -.06899287    .3085502  -.5258574 -1.2652103  -1.4825175
     9951 "SGMA.1" "826613101"    21.235 14  -.04879016    .3088641  -.5286109 -1.3219784   -1.541725
     9981 "SGMA.1" "826613101"    21.235 14           0   1.1570412  -.8669893 -1.6363224  -1.3462706
    10012 "SGMA.1" "826613101"    21.235 14           0    .9750245  -.9313188 -1.3190898   -1.275384
    10042 "SGMA.1" "826613101"    21.235 14           0    .9766609  -.9305343 -1.3382547   -1.292128
    10073 "SGMA.1" "826613101"    21.235 14           0    .4624018  -.4641969  -1.408453   -1.410248
    10104 "SGMA.1" "826613101" 20.704124 14  -.02531781   1.0842124  .53759754  -.7715728    .8502372
    10134 "SGMA.1" "826613101"  22.29675 14  .074107975   1.0893105   .5432917  -.6325774   1.0000248
    10165 "SGMA.1" "826613101"  13.80275 14   -.4795731   1.0833185  -.3346466 -1.4110214   -.6623495
    10195 "SGMA.1" "826613101"  15.92625 14   .14310084    1.193245   .4185636  -.8900397    .7217689
    10226 "SGMA.1" "826613101" 13.537313 14  -.16251893   1.2294794   .4344717  -.9511412    .7128099
    10257 "SGMA.1" "826613101" 20.704124 14    .4248832   1.5129596  1.5841116 -1.4902065   1.6068647
    10286 "SGMA.1" "826613101"   40.3465 14    .6671717   1.4747086  1.5486495  -.5930199    2.430338
    10317 "SGMA.1" "826613101"     42.47 14    .0512933   1.4720894  1.5505384 -.57145756     2.45117
      942 "SGN.1"  "826622102"         . 11           .  -.03440913          . -.18337996   -.2177891
      973 "SGN.1"  "826622102"         . 11 -.020409487 -.022594307          . -.16534238   -.1879367
     1003 "SGN.1"  "826622102"         . 11  -.09737428  -.08488791          .  -.1639042   -.2487921
     1034 "SGN.1"  "826622102"         . 11  -.05273007  -.12402533          . -.12759694  -.25162226
     1064 "SGN.1"  "826622102"         . 11   .28923896  -.11930303          . -.15504125  -.27434427
     1095 "SGN.1"  "826622102" 269.90213 11  -.05040591 -.015629595          .  -.4026078   -.4182374
     1126 "SGN.1"  "826622102"  272.4482 11  .009389117    .8411043  -.3530496  -.8623545   -.3742998
     1154 "SGN.1"  "826622102" 259.71716 11  -.04785531    .7406401  -.3353857 -.43862915 -.033374727
     1185 "SGN.1"  "826622102" 261.04092 11  .005083915    .7578776 -.27877602  -.4187074   .06039423
     1215 "SGN.1"  "826622102" 263.48538 11  .009320753     .601351  -.2368395  -.9145672   -.5500558
     1246 "SGN.1"  "826622102"  294.0405 11   .10971962    .5988517  -.2211059  -.3845393 -.006793529
    end
    format %td datadate
    I figured that the first step is to calculate whole market cap in each month (market value of the whole stock market) but I'm jamming a bit here and have no idea how to proceed.
    Any help would be much appreciated!

  • #2
    The data you post as far as I can see is not usable for the purpose as it has only one company per month/year. Hence what follows is pseudo code.

    I think you

    1. firstly need to allocate the firms to deciles. Install user written -egenmore- (-findit egenmore-, then follow the instructions)
    Code:
     egen deciles = xtile(Quality) , nquantiles(10) by(datadate)
    this might take a while, -egen, xtile- is slow. You might want to go have a beer or eat something while Stata is calculating.

    2. Calculate your total market capitalisation by month, and your value weight

    Code:
    egen totalmktcap = total(ME), by(datadate)
    
    gen valueweight = ME/totalmktcap
    3. Finally calculate the value weighted returns by decile portfolio. Firstly install the user written weighted mean egen fucntion (-findit _gwtmean-, and follow instructions), then:
    Code:
     egen vwret = wtmean(return), weight(valueweight) by(datadate deciles)
    Last edited by Joro Kolev; 27 Sep 2020, 23:56.

    Comment


    • #3
      As pointed out by Joro, if your dataset is large, astile (from SSC) is then an order of magnitude faster for creating deciles. For some examples, please read this post
      https://fintechprofessor.com/create-...-using-astile/
      Regards
      --------------------------------------------------
      Attaullah Shah, PhD.
      Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
      FinTechProfessor.com
      https://asdocx.com
      Check out my asdoc program, which sends outputs to MS Word.
      For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

      Comment


      • #4
        Originally posted by Attaullah Shah View Post
        As pointed out by Joro, if your dataset is large, astile (from SSC) is then an order of magnitude faster for creating deciles. For some examples, please read this post
        https://fintechprofessor.com/create-...-using-astile/
        Thank you Joro Kolev and Attaullah Shah h, I will try both methods.

        Attaullah Shah is there any way that your method can incorporate the use of NYSE breakpoints? As is very common in these asset pricing papers (for example the Fama French 3/5 factor model) they use NYSE breakpoints to divide the decile portfolios, even though the portfolios themselves include not only NYSE stocks but also NASDAQ and AMEX stocks. Meaning that in my dataset (which includes NYSE/NASDAQ/AMEX stocks) I have to somehow manage to divide all these stocks in 10 portfolios but then each portfolio's size is dependent on only the NYSE stock's value, if that makes sense.

        If anyone has any other ideas, they are more than welcome to reply!
        I would love to know how to implement this in the stata commands. Thanks for your help

        Comment


        • #5
          I'm sorry, the above question on NYSE breakpoints is for Joro Kolev as the method proposed by Attaullah Shah's website already shows a way to incorporate NYSE breakpoints in forming the decile portfolios.
          In this way, I just want to be able to compare both of these methods in my dataset.

          Comment


          • #6
            Attaullah Shah has automated the method, I am not aware of another canned command that achieves what astile does in Example 3 of the link above in #3. I would say just go with astile, otherwise you will need to work more.

            If you want to do it manually and bypass astile, say because you want to verify the results, one possible procedure is as follows (pseudo code below, I do not have a workable datasample, do report back whether this works or not):

            1. Generate the break points
            Code:
            forvlaues i = 10(10)90 {
            egen break = pctile(Quality/(exchange=="NYSE")), p(`i') by(datadate)
            }
            We used the famous "Nick's approach of division by zero". For fun discussion you can see
            https://www.statalist.org/forums/for...e-per-subgroup
            but basically when exchange=="NYSE", the expression in the denominator evaluates to 1, so Quality/1 is just Quality. When exchange!="NYSE" the expression in the denominator we have 0, so division by zero results in Stata generating a missing value, that is, omitting the observation from the xtile calculation.

            Now we have 9 variables break10, break20..., break90 containing the break points based on NYSE

            2. We assign all the firms to the portfolios

            Code:
            gen thextiles = .
            
            local j = 1
            
            forvlaues i = 10(10)90 {
            replace thextiles = `j' if Quality<break`i' & missing(thextiles) 
            local ++j
            }
            
            replace thextiles = 10 if !missing(Quality) & missing(thextiles)








            Originally posted by Bob Ferguson View Post
            I'm sorry, the above question on NYSE breakpoints is for Joro Kolev as the method proposed by Attaullah Shah's website already shows a way to incorporate NYSE breakpoints in forming the decile portfolios.
            In this way, I just want to be able to compare both of these methods in my dataset.
            Last edited by Joro Kolev; 04 Oct 2020, 03:33. Reason: I am editing because I was wrong.

            Comment


            • #7
              Thank you very much Joro Kolev and Attaullah Shah. Using astile I was able to create 10 decile portfolios in each month in my dataset. Now since i need value-weighted portfolio returns I have used the code given by Joro as in:
              1. To create the portfolios sorted on the quality score of each stock in each month (Using NYSE breakpoints, exchg=1=NYSE) I used

              bysort datadate: astile QualityPortfolio = Quality, qc(exchg == 11) nq(10)
              2. Then to calculate value-weighted returns of these portfolios in each month I have used

              egen totalmktcap = total(ME), by(datadate)
              gen valueweight = ME/totalmktcap
              egen vwret = wtmean(return), weight (valueweight) by (datadate QualityPortfolio)
              Now this does give me returns for each month in my sample period (1962-2019). So that works great! But now I would like to draw the cumulative returns over the years of each decile portfolio in a line graph. So my first question is:
              1. How do I correctly calculate the cumulative returns of each portfolio (1-10) in my dataset for the whole sample period (1962-2019)?

              My second question is on how to graph these cumulative returns in a time-series line graph for each of the 10 portfolios.

              I have already set my data as time series data, using "tsset datadate" but when I use the twoway time-series line graph function is stata it tells me: "repeated time values in sample".

              Now it's true that there are repeated time values in my sample but this is normal since I have 10 portfolios in each month, so: in 31may2010 I have 10 portfolios (with each of them having a different return) and in 30june20210 I have 10 portfolios with 10 different returns

              I would just like to graph the cumulative returns of each decile portfolio over the years in a time series graph, but I cannot seem to find a way to do this using my dataset. Is it maybe necessary to declare my dataset as panel data or should I look for another way to achieve this?
              Thanks in advance
              Last edited by Bob Ferguson; 03 Nov 2020, 15:14.

              Comment


              • #8
                Joro Kolev and Attaullah Shah, first of all thank you for your discussion above, and for creating the astile command, respectively. I found the discussion above pretty useful for my purposes and I understand most of what is said above. Moreover, the astile command works much faster for my data then the alternative methods so I am very glad to be able to use it.


                I do have a question related to this thread which is related to the code quoted below.


                Originally posted by Joro Kolev View Post

                3. Finally calculate the value weighted returns by decile portfolio. Firstly install the user written weighted mean egen fucntion (-findit _gwtmean-, and follow instructions), then:
                Code:
                egen vwret = wtmean(return), weight(valueweight) by(datadate deciles)
                How is the weight applied to get the average return? For example if a decile is 9% of the total market capitalization at a certain date, does this change it's average return in comparison to a decile that is 11% of total market capitalization. Or does the weight only matter for averaging returns within each decile? So that a stock that is 1% of market capitalization within a decile has a return weight of 1% for computing that decile's average return.

                Thank you in advance.

                Kind regards, Julien Maas.

                Comment

                Working...
                X