Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating percentiles and portfolio through xtile

    Dear Statalist,

    I am new to Stata and trying to make percentiles of percentiles through xtile, but I do still not know how to do it exactly. I have a panel data with the ID of the company (companyID) and the year (year), revenues (rev) and a variable cost of goods sold (cogs).

    What I want to do is the following:

    1. Sort stocks into two revenues groups (Big stocks are those in the top 90% revenues, and small stocks are those in the bottom 10%)

    2. Sort stocks into three cost of good sold groups (breakpoints are 30th and 70th percentiles)

    3. Sort stocks further with the independent 2x3 sorts on revenues and cost of goods sold produce six portfolios, SG, SN, SV, BG, BN, and BV, where S and B indicate small or big and G, N, and V indicate growth (low cost of goods sold), neutral, and value (high cost of goods sold).


    Any help would be appreciated !

    Thanks in advance,

    Greetings!

  • #2
    Welcome to the Stata Forum / Statalist,

    Please read the FAQ, particularly on how to share command and data. This is the best way to entail helpful replies. Thanks.
    Best regards,

    Marcos

    Comment


    • #3
      Originally posted by Marcos Almeida View Post
      Welcome to the Stata Forum / Statalist,

      Please read the FAQ, particularly on how to share command and data. This is the best way to entail helpful replies. Thanks.
      Hello Marcos,

      Thank you for responding, my data can be shown as follows (through the command dataex)

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long companyID double(year cogs rev)
       29194 2000      1173      3333
       29194 1999       996      3280
      201336 2011   317.406   619.073
      201336 2010   210.555   479.059
       29194 2014  2446.455  3909.735
       29194 2010  1563.174  3696.172
      201336 2008    45.047     53.86
      201336 2009   192.985   376.647
      201336 2007    29.269    35.727
      201336 2012   433.082   766.597
       14802 1997  5734.747     11724
      100243 2012  9860.236 12480.652
       14802 2001   6898.68   12951.1
       14802 1999  6227.712 12813.811
       14802 1996  5768.638 11818.132
      100243 2013  8939.358 11387.787
      100165 2011  1070.206  2676.536
       14802 1995  4540.552  9646.976
      100243 2011 10493.535  13320.88
      100243 2007  6710.098  8457.121
      100243 2004  5551.799  7234.182
       14802 2008 24494.496 32735.998
       17525 1994  2812.274  3396.893
      100243 2005  6394.245  8455.675
      100243 2010  6489.489  8352.376
       14802 2003  7010.426  15406.08
       14802 2002  5401.855 11691.895
       14802 2004   8705.84 17544.244
      100165 2009   728.197  2008.602
       14802 1994   2900.06  7575.379
       14802 2015 12131.262 25702.326
      100243 2014  7913.117   10231.5
       14802 2009 20161.958 31610.215
      100243 2006  6574.329  8196.402
       17525 1997   3399.24  3571.669
       14802 2013 18469.134 33859.145
       14802 2011 22397.558 37104.178
      100165 2013  1454.535  3243.013
       14802 2012 24966.519 40528.782
       14802 1998  6013.613  11492.34
       14802 2010 19626.076 33331.264
       14802 2014 15633.988 31703.953
       14802 2000  6353.961 13017.661
       17525 1995  2606.325  3506.744
       17525 1996   2936.05  3688.606
      100165 2014   2512.38   3329.68
       14802 2005 10947.833 21411.864
      100165 2010   877.578  2320.975
      100165 2012  1477.884  3394.354
      177925 2014    1865.1      2851
      241637 2015     14098     43604
      177925 2015    1700.4    2627.5
      241637 2014     15565     47063
      177925 2013    1819.3    2720.5
      153130 2015  2894.311  3850.309
      177983 2015  1445.744  2461.044
      100424 2014     29568     40487
      165262 2014    92.095   150.789
      175688 2013    1035.8    1219.1
      177983 2013  1271.658  2131.997
      153130 2014  2923.919  4346.523
      177983 2012  1108.957  1901.971
      138861 2014   262.603   585.715
      100424 2012     29702     40192
       65500 2009    52.098    124.37
      100424 2005      8882     12285
      138861 2006   232.841   492.804
      178552 2012    74.243   487.094
      138861 2011   324.228   588.559
      138861 2015   243.609   551.385
      178552 2014    85.625   563.091
      177983 2014  1333.546  2279.438
      100424 2007     14366     19877
      138861 2007   278.778   578.415
      162387 2013   102.806   399.276
      162387 2014    99.129   402.928
      175688 2014    1197.6      1265
       65500 2008    45.643       228
      100424 2011     28454     38455
      100424 2010     21852     30862
       65500 2010    47.113   126.416
      165262 2015   128.659   365.114
      138861 2010    342.37   571.942
       64355 2010    30.145    95.897
      152249 2007  1056.657   1838.09
      138861 2012   245.211   545.418
      152249 2015  1264.993  2111.083
      138861 2013   267.891   594.081
      100424 2009     16148     22923
      138861 2009    223.65   456.956
      178552 2013    61.712   528.973
      153130 2013  2798.386  4196.365
      100424 2013     29624     39997
       65500 2015   225.245   445.738
      100424 2015     27431     37770
      138861 2005   205.291   430.647
      205942 2014  7331.752 10946.013
      222111 2010     74805    120052
      222111 2005     30783     56324
      209382 2008 12410.268 30431.839
      end

      I look forward to your reply.

      Comment


      • #4
        I wonder whether you wish something like this:

        Code:
        . xtile myxtile = rev, nquantiles(10)
        
        . gen bigstock = myxtile
        
        . recode bigstock (1 = 1) (2/8 =2) (9/10 = 3)
        (bigstock: 80 changes made)
        
        . label define big 1 "small" 2 "medium" 3 "big"
        
        . label values bigstock big
        
        . tab bigstock
        
           bigstock |      Freq.     Percent        Cum.
        ------------+-----------------------------------
              small |         10       10.00       10.00
             medium |         70       70.00       80.00
                big |         20       20.00      100.00
        ------------+-----------------------------------
              Total |        100      100.00
        Best regards,

        Marcos

        Comment


        • #5
          Originally posted by Marcos Almeida View Post
          I wonder whether you wish something like this:

          Code:
          . xtile myxtile = rev, nquantiles(10)
          
          . gen bigstock = myxtile
          
          . recode bigstock (1 = 1) (2/8 =2) (9/10 = 3)
          (bigstock: 80 changes made)
          
          . label define big 1 "small" 2 "medium" 3 "big"
          
          . label values bigstock big
          
          . tab bigstock
          
          bigstock | Freq. Percent Cum.
          ------------+-----------------------------------
          small | 10 10.00 10.00
          medium | 70 70.00 80.00
          big | 20 20.00 100.00
          ------------+-----------------------------------
          Total | 100 100.00
          Thank you for your reply. The point however is that I would like to have 2x3 sorts on rev and cogs. I want to sort stocks into two rev and three cogs groups at the end of each year. Big stocks are those in the top 90% of rev, and small stocks are those in the bottom 10%. The cogs breakpoint are the 30th and 70th percentiles of B/M.

          The independent 2x3 sorts on rev and cogs produce six equally weighted portfolios, SG, SN, SV, BG, BN, and BV, where S and B indicate small (low rev) or big (high rev) and G, N, and V indicate growth (low cogs), neutral, and value (high cogs).

          English isn't my native language, so if I am in some way unclear then I would like to try to clarify myself in a different way.


          With kind greetings,

          Thanks in advance!

          Comment


          • #6
            Can anybody help? :D

            Comment


            • #7
              Michel:
              elaborating on Marcos' helpful code, you may want to try:
              Code:
              . xtile myxtile = rev, nquantiles(10)
              
              . gen bigstock = myxtile
              
              . recode bigstock (1 = 1) (2/8 =2) (9/10 = 3)
              (bigstock: 80 changes made)
              
              . label define big 1 "small" 2 "medium" 3 "big"
              
              . label values bigstock big
              
              .  tab bigstock
              
                 bigstock |      Freq.     Percent        Cum.
              ------------+-----------------------------------
                    small |         10       10.00       10.00
                   medium |         70       70.00       80.00
                      big |         20       20.00      100.00
              ------------+-----------------------------------
                    Total |        100      100.00
              
              . xtile myxtile_2 = cogs , nquantiles(10)
              
              . . gen cogsstock = myxtile_2
              
              . recode cogsstock (1/3 = 1) (4/6 = 2) (7/10 = 3)
              (cogsstock: 90 changes made)
              
              . label define cogsstock 1 "low" 2 "neutral" 3 "high"
              
              . label val cogsstock cogsstock
              
              .  tab cogsstock
              
                cogsstock |      Freq.     Percent        Cum.
              ------------+-----------------------------------
                      low |         30       30.00       30.00
                  neutral |         30       30.00       60.00
                     high |         40       40.00      100.00
              ------------+-----------------------------------
                    Total |        100      100.00
              
              . g portfolio=1 if bigstock==1 & cogsstock==1
              (90 missing values generated)
              
              . replace portfolio=2 if bigstock==1 & cogsstock==2
              (0 real changes made)
              
              . replace portfolio=3 if bigstock==1 & cogsstock==3
              (0 real changes made)
              
              . replace portfolio=4 if bigstock==3 & cogsstock==1
              (0 real changes made)
              
              . replace portfolio=5 if bigstock==3 & cogsstock==2
              (0 real changes made)
              
              . replace portfolio=6 if bigstock==3 & cogsstock==3
              (20 real changes made)
              
              . label define portfolio 1 "SG" 2 "SN" 3 "SV" 4 "BG" 5 "BN" 6 "BV"
              
              . label val portfolio portfolio
              
              . tab portfolio
              However, as far as I can get your query, your classification criteria exclude most of the stocks in your dataset.
              Kind regards,
              Carlo
              (Stata 19.0)

              Comment


              • #8
                Originally posted by Carlo Lazzaro View Post
                Michel:
                elaborating on Marcos' helpful code, you may want to try:
                Code:
                . xtile myxtile = rev, nquantiles(10)
                
                . gen bigstock = myxtile
                
                . recode bigstock (1 = 1) (2/8 =2) (9/10 = 3)
                (bigstock: 80 changes made)
                
                . label define big 1 "small" 2 "medium" 3 "big"
                
                . label values bigstock big
                
                . tab bigstock
                
                bigstock | Freq. Percent Cum.
                ------------+-----------------------------------
                small | 10 10.00 10.00
                medium | 70 70.00 80.00
                big | 20 20.00 100.00
                ------------+-----------------------------------
                Total | 100 100.00
                
                . xtile myxtile_2 = cogs , nquantiles(10)
                
                . . gen cogsstock = myxtile_2
                
                . recode cogsstock (1/3 = 1) (4/6 = 2) (7/10 = 3)
                (cogsstock: 90 changes made)
                
                . label define cogsstock 1 "low" 2 "neutral" 3 "high"
                
                . label val cogsstock cogsstock
                
                . tab cogsstock
                
                cogsstock | Freq. Percent Cum.
                ------------+-----------------------------------
                low | 30 30.00 30.00
                neutral | 30 30.00 60.00
                high | 40 40.00 100.00
                ------------+-----------------------------------
                Total | 100 100.00
                
                . g portfolio=1 if bigstock==1 & cogsstock==1
                (90 missing values generated)
                
                . replace portfolio=2 if bigstock==1 & cogsstock==2
                (0 real changes made)
                
                . replace portfolio=3 if bigstock==1 & cogsstock==3
                (0 real changes made)
                
                . replace portfolio=4 if bigstock==3 & cogsstock==1
                (0 real changes made)
                
                . replace portfolio=5 if bigstock==3 & cogsstock==2
                (0 real changes made)
                
                . replace portfolio=6 if bigstock==3 & cogsstock==3
                (20 real changes made)
                
                . label define portfolio 1 "SG" 2 "SN" 3 "SV" 4 "BG" 5 "BN" 6 "BV"
                
                . label val portfolio portfolio
                
                . tab portfolio
                However, as far as I can get your query, your classification criteria exclude most of the stocks in your dataset.
                Hello Carlo,

                Thank you for your fast response, may I ask what the intuition is of
                Code:
                 recode bigstock (1 = 1) (2/8 =2) (9/10 = 3)
                and
                Code:
                  
                 recode cogsstock (1/3 = 1) (4/6 = 2) (7/10 = 3)
                Should not it be 3/10 for the lowest group for cogs and 1/10 for the lowest group of rev? Now the total percentage of the first recode is 1+ 2/8 + 9/10 = 2.15 and the second recode: 1/3 + 4/6 + 7/10 = 1.7 and both are higher than 1 (100%) ?!


                Big group of rev needs to be top 10% instead of 20% and the top group of cogs needs to be top 30% instead of the top 40%, Or is my intuition wrong?
                Last edited by Michel Loki; 07 May 2017, 07:28.

                Comment


                • #9
                  Michel:
                  you probably misunderstood -recode- syntax; please see -help recode-.
                  As you can see, in -tab bigstock- and in -tab cogsstock- frequencies summed up to 100 (or 100%).
                  Kind regards,
                  Carlo
                  (Stata 19.0)

                  Comment


                  • #10
                    Cross-posted at http://stackoverflow.com/questions/4...-through-xtile

                    Reading the FAQ as urged by Marcos in #2 discloses our policy on cross-posting, which is that you are asked to tell us about it. .

                    Comment


                    • #11
                      Originally posted by Carlo Lazzaro View Post
                      Michel:
                      you probably misunderstood -recode- syntax; please see -help recode-.
                      As you can see, in -tab bigstock- and in -tab cogsstock- frequencies summed up to 100 (or 100%).
                      I have taken a look at it. I understand it now.

                      You are a true hero, thank you!

                      Comment


                      • #12
                        Michel:
                        far from being anybody's hero, I'm still learning how Stata works.
                        Kind regards,
                        Carlo
                        (Stata 19.0)

                        Comment

                        Working...
                        X