Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to group panel data into multiple portfolio base on data and rank

    Hi all!

    I am doing analysis on short selling and return performance. My data set is a panel data containing 36 monthly short interest on 226 different stocks. I would like to group and form monthly portfolio basing on the rank in short interest ratio (eg, portfolio A to consist of top 20% most shorted stocks, as at January 2010). Would there be any solution to grouping and separating the the data base on (date) month and then by the their rank?

    Thanks a lot!

  • #2
    Something like this:
    Code:
    gen mdate = mofd(date)
    format mdate %tm
    
    by mdate (frequency_of_shorting), sort: gen byte portfolio_A = (_n <= 0.20*_N)
    This assumes that date is a Stata internal format daily date variable. If what you have is something else, you will need to convert it to that first. Replace frequency_of_shorting by whatever variable you have that indicates how often a stock is shorted. Also, if there are observations that are tied on the frequency of shorting and inclusion of all of them would make the portfolio exceed 20% of the observations, the ties will be broken arbitrarily and irreproducibly.

    Since no example data was provided (descriptions are not adequate for creating and testing code), this code is just a guess and may be wrong or inapplicable to your data setup. In the future, when requesting help with code, please show example data, using the -dataex- command. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Comment


    • #3
      There have been a large number of similar questions asked and answered on Statalist. In addition to Clyde's answer, you need to search Statalist more seriously.

      Comment


      • #4
        Thanks Clyde for your help, apologies for the confusing question as I am still exploring the forum. Follow up with a more detailed description of my dataset, as shown below, I basically have a collection of monthly "short rate" and "return" data on 1000 stocks. My plan is to split up my data sort by month (for example portfolio 200601 to have all short rates as at Jan 2006) and obtaining 108(9 years sample size) subgroups horizontally (demonstrated below). Then I would sort the monthly subgroup (portfolios) by short rate and lastly to split up again into quintile and compute the average return (another variable) of each quintile.

        My ultimate aim is to somehow split the data up into small monthly subgroups and then look into the "average returns for different quintiles" of the supgroups. What I want to know is the "returns" for the 5 corresponding quintiles of " short_rate" throughout the whole sample period, with consideration that the "short_rate" is to be rebalanced monthly. I am sure if my approach is the most efficient and I would appreciate any advices and comments

        Again, thanks alot for the time and help!

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str8 tic long variable_for_other_program float(short_rate_test return)
        "BMY"    200601            0     .00003
        "CMCSA"  200601            0     .00003
        "KSU"    200601            0     .00003
        "HIW"    200601            0     .00003
        "HNZ"    200601            0     .00003
        "AMB"    200601            0     .00003
        "ORH"    200601     1.00e-08     .00004
        "EXC"    200601     1.00e-08     .00004
        "LNC"    200601     1.00e-08     .00004
        "EXC"    200601     7.00e-08      .0001
        "GXP"    200601     1.30e-07     .00016
        "NAV"    200601     1.50e-07     .00018
        "RGA"    200601     2.00e-07     .00023
        "VNO"    200601     2.50e-07     .00028
        "CBL"    200601     3.80e-07     .00041
        "TAP"    200601     4.00e-07     .00043
        "OFC"    200601     4.30e-07     .00046
        "PNC"    200601     4.90e-07     .00052
        "AEH"    200601     5.10e-07     .00054
        "EXC"    200601     5.20e-07     .00055
        "XEL"    200601     5.50e-07     .00058
        "XEL"    200601     6.00e-07     .00063
        "PFG"    200601     6.90e-07     .00072
        "STAR"   200601     7.80e-07     .00081
        "EQR"    200601     8.00e-07     .00083
        "MKC"    200601     9.70e-07       .001
        "PBI"    200601     9.90e-07     .00102
        "ORH"    200601     1.00e-06     .00103
        "AHL"    200601     1.00e-06     .00103
        "MS"     200601     1.00e-06     .00103
        "AEH"    200601     1.00e-06     .00103
        "STAR"   200601     1.21e-06     .00124
        "BEAM"   200601     1.22e-06     .00125
        "9466B"  200601     1.32e-06     .00135
        "PPS"    200601     1.73e-06     .00176
        "STZ.B"  200601     1.94e-06     .00197
        "EXC"    200601     1.98e-06     .00201
        "JPM"    200601     2.10e-06     .00213
        "DD"     200601     2.22e-06     .00225
        "GXP"    200601     2.30e-06     .00233
        "HL"     200601     2.42e-06     .00245
        "LXP"    200601     2.44e-06     .00247
        "XEL"    200601     2.66e-06     .00269
        "WPZ"    200601 2.712834e-06 .002742834
        "AFG"    200601     2.92e-06     .00295
        "DGAS"   200601     2.92e-06     .00295
        "KMG"    200601     2.98e-06     .00301
        "WRI"    200601     3.15e-06     .00318
        "CGG"    200601 3.270313e-06 .003300313
        "GXP"    200601     3.37e-06      .0034
        "MET"    200601     4.00e-06     .00403
        "UNM"    200601     4.00e-06     .00403
        "OHI"    200601     4.27e-06      .0043
        "DD"     200601     4.63e-06     .00466
        "XEL"    200601     4.74e-06     .00477
        "BDN"    200601     4.88e-06     .00491
        "PPS"    200601     5.00e-06     .00503
        "EQR"    200601     5.11e-06     .00514
        "KRC"    200601     5.20e-06     .00523
        "USM"    200601     5.44e-06     .00547
        "BAC"    200601     5.63e-06     .00566
        "ENH"    200601     5.63e-06     .00566
        "EPR"    200601     5.65e-06     .00568
        "XEL"    200601     5.66e-06     .00569
        "CLP"    200601     5.68e-06     .00571
        "STAR"   200601     5.84e-06     .00587
        "SLG"    200601     5.90e-06     .00593
        "VNO"    200601     6.11e-06     .00614
        "AMB"    200601     6.43e-06     .00646
        "WLB"    200601     6.97e-06       .007
        "BAC"    200601     7.03e-06     .00706
        "REG"    200601     7.20e-06     .00723
        "EQR"    200601     7.52e-06     .00755
        "LXP"    200601     7.76e-06     .00779
        "PNC"    200601     7.77e-06      .0078
        "STAR"   200601     7.92e-06     .00795
        "ENGY"   200601     7.94e-06     .00797
        "O"      200601     8.45e-06     .00848
        "NCOC"   200601     8.47e-06      .0085
        "RTU"    200601     8.87e-06      .0089
        "PNX"    200601     9.02e-06     .00905
        "MS"     200601     9.60e-06     .00963
        "BRE"    200601     .0000101     .01013
        "XEL"    200601      .000011     .01103
        "GFSI"   200601   .000011362    .011392
        "ARE"    200601    .00001207      .0121
        "OFC"    200601    .00001207      .0121
        "KRC"    200601    .00001249     .01252
        "SPG"    200601    .00001307      .0131
        "RBS.PX" 200601    .00001644     .01647
        "CBL"    200601    .00001743     .01746
        "EMMS"   200601    .00001748     .01751
        "STAR"   200601    .00001867      .0187
        "RBS.PX" 200601    .00002091     .02094
        "YORW"   200601    .00002208     .02211
        "RBS.PX" 200601    .00002226     .02229
        "AMYRZ"  200601     .0000223     .02233
        "RBS.PX" 200601    .00002252     .02255
        "GSLA"   200601    .00002316     .02319
        "BAC"    200601      .000024     .02403
        end
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str5 A str6 B str10 C str6 D byte E str5 F str6 G str10 H str6 I
        "Tic"   "Data"   "short_rate" "return" . "Tic"   "Data"   "short_rate" "return"
        "BMY"   "200601" "0"          "0"      . "BMY"   "200602" "0"          "0"     
        "CMCSA" "200601" "0"          "0"      . "CMCSA" "200602" "0"          "0"     
        end

        Comment


        • #5
          OK. In your first example data, there is only a single month represented. So this code is not thoroughly tested, but it works in the example and I believe it is correct:
          Code:
          // FIRST CREATE A REAL STATA MONTHLY DATE VARIABLE
          numdate monthly mdate = variable_for_other_program, pattern("YM")
          assert !missing(mdate)
          
          //    CALCULATE QUINTILES OF SHORT_RATE_TEST IN EACH MONTH
          capture program drop one_month
          program define one_month
              xtile quintile = short_rate_test, nq(5)
              exit
          end
          
          runby one_month, by(mdate)
          
          //    CALCULATE MEAN RETURN IN EACH QUINTILE-MONTH
          by mdate quintile, sort: egen mean_return = mean(return)
          To run this you will need to install two community-contributed programs from SSC. -numdate- is by Nick Cox, -runby- is by Robert PIcard and me.

          I don't understand what the second data example is about. It seems to look like some sort of data import from a spreadsheet, with the data in wide form. That's not usable for your purpose; the layout in the first example is what you should use.

          Comment


          • #6
            Thanks for the help Clyde! The code works perfectly and you are right about the second example being inappropriate. Thanks a lot.

            Comment

            Working...
            X