Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Decile portfolios

    Dear Stata Experts,

    I am using the following command:

    ---

    sort YMD
    foreach X of varlist MktCap* {
    by YMD: egen p10_`X'= pctile(`X'), p(10.0)
    by YMD: egen p20_`X'= pctile(`X'), p(20.0)
    by YMD: egen p30_`X'= pctile(`X'), p(30.0)
    by YMD: egen p40_`X'= pctile(`X'), p(40.0)
    by YMD: egen p50_`X'= pctile(`X'), p(50.0)
    by YMD: egen p60_`X'= pctile(`X'), p(60.0)
    by YMD: egen p70_`X'= pctile(`X'), p(70.0)
    by YMD: egen p80_`X'= pctile(`X'), p(80.0)
    by YMD: egen p90_`X'= pctile(`X'), p(90.0)

    ---

    in Stata 13 on Windows in order to build 10 decile portfolios. YMD is the referring time variable and MktCap stands for market capitalization.

    My dataset is from CRSP and contains US stock data from 1963-2016. My goal is to create 10 portfolios. Portfolio 1 shall contain the stocks with the lowest market capitalization and portfolio 10 the ones with the highest market capitalization. Every year in June, I want to repeat the portfolio building. Unfortunately, my formula 1. is not complete, since I don't know exactly how to implement the request to only repeat the portfolio building in June in each year. 2. an error appears saying "invalid 'p' " after creating portfolio 4 when I use this formula.

    The relevant variables (columns) looks like:

    VMD MktCap
    166301 5.434
    166301 4.434
    166301 3.494
    166302 3.494
    166302 3.494
    166303 3.494
    166303 3.494
    166304 3.494

    Thank your for your help.

    Best,
    Guest
    Last edited by sladmin; 01 Jun 2017, 09:05. Reason: anonymize poster

  • #2
    Your chances of getting a helpful response would be improved if you would read and follow the advice in FAQ #12 about how to give example data and show code. In addition, the data you do show is confusing in the context of your explanation. It appears to contain a variable VMD which is nowhere mentioned earlier. It appears not to contain the date variable YMD which, judging from your statement of your problem, is critical to the problem. Perhaps VMD is just a typo for YMD? But then in what sense is YMD a date variable? It doesn't look like one, and if I ask Stata to display it with proper date formatting, these numbers correspond to dates in the year 2145. Also, your code calls for a loop over a series of variables whose names begin with MktCap, but your example data contains only a single variable MktCap. Very confusing.

    If you use -dataex- (see FAQ#12) to post a sample of your data that contains the relevant information, somebody will probably be able to help you.

    Comment


    • #3
      I tried to extract a small sample, but when I try to extract the data by the command dataex it always gives me a bunch of unformatted data. I am new to Stata and not so familiar with it. Please apologize the inconvenience. If I know how to extract only a small set of my data, I will post it here. Can anyone give me advice how to do that?

      Thanks!

      Comment


      • #4
        By default dataex won't give you more than values for 100 observations and it will spell out formats and labels for those variables (that's much of the point). So, the challenge at most is to pin down the variables you are going to show.

        Code:
        dataex VMD MktCap*
        should be enough if those are your variables of concern. Conversely if YMD is the real variable name, clearly you should use it.

        Comment


        • #5
          Thank you Nick. I used the code that you posted. My outcome was:

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input long YMD float MktCap
          1   .061875
          1      .111
          1  .1308125
          1   .254375
          1   .275625
          1   .277875
          1   .278875
          1  .2884375
          1  .2961875
          1      .351
          1 .36328125
          1    .41825
          1   .500625
          1    .51675
          1   .561875
          1      .599
          1   .746875
          1       .78
          1    .81125
          1   .826875
          end
          label values YMD YMD
          label def YMD 1 "196301", modify
          In my data editor the "YMD" observations are shown as e.g. 196301, 196302 (YYYYMM) and so on until Dec-2016. I don't know why this variable looks differently here.
          What I am trying to do is to allocate the stock data to 12 decile portfolios every month. The sort variable shall be market capitalization.

          And the next step is to calculate the average monthly return of each portfolio and give that as a new variable.

          Comment


          • #6
            So, the date information is in your value labels. There is a fix for that, but someone else may get to that before I do. I have to leave for a few hours.

            Comment


            • #7
              So, your YMD variable is really rather useless in its current form. What you have is an -encode-d string variable with value labels. The first step is to convert that to an actual Stata internal format date variable. In this case, it appears that your information is at the monthly level, not the daily level.

              The example data you show doesn't have any variable that identifies the different firms you are allocating to portfolios, so I've expanded the example data set to include that. You also only show data from a single month, and a non-June month at that, so we can't really do the portfolio generation based on that. But I've written code that I think will work--it isn't tested and may contain typos or other errors, but it will have the gist of it.

              One key idea in the code is that there is a difference between the portfolio functional year and the calendar year, because you re-set the portfolio each June. So it is necessary to capture that in the data structure by generating a functional year variable that starts in June and runs through the following May.

              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input long YMD float MktCap long firm_id
              1   .061875  1
              1      .111  2
              1  .1308125  3
              1   .254375  4
              1   .275625  5
              1   .277875  6
              1   .278875  7
              1  .2884375  8
              1  .2961875  9
              1      .351 10
              1 .36328125 11
              1    .41825 12
              1   .500625 13
              1    .51675 14
              1   .561875 15
              1      .599 16
              1   .746875 17
              1       .78 18
              1    .81125 19
              1   .826875 20
              end
              label values YMD YMD
              label def YMD 1 "196301", modify
              
              
              //    GET A REAL DATE VARIABLE
              decode YMD, gen(string_date)
              replace string_date = substr(string_date, 1, 4)+"m"+substr(string_date, 5, .)
              gen YM = monthly(string_date, "YM")
              format YM %tm
              drop string_date YMD
              
              //    GET A "FUNCTIONAL" YEAR THAT RUNS FROM JUNE THROUGH MAY
              gen functional_year = year(dofm(YM-6))
              
              //    SAVE ORIGINAL DATA
              tempfile copy
              save `copy'
              
              //    KEEP ONLY THE JUNE DATA TO IDENTIFY PORTFOLIOS
              keep if month(dofm(YM)) == 6
              isid firm_id functional_year
              
              //    IDENTIFY PORTFOLIOS IN EACH YEAR
              levelsof functional_year, local(years)
              gen int portfolio = .
              foreach y of local years {
                  xtile pf = MktCap if functional_year == `y', nq(10)
                  replace portfolio = pf if functional_year == `y'
                  drop pf
              }
              
              //    NOW MERGE THESE RESULTS BACK WITH THE ORIGINAL DATA
              merge 1:m firm_id functional_year using `copy'
              Added: Crossed with Nick's #6.

              Comment


              • #8
                Thank you Clyde and Nick. I believe I have a working date variable in my dataset. (see attached). Furthermore, I have a variable that identifies the different firms. It is called permno (see attached as well). Hope that works.

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input long(date permno) float MktCap
                19630131 37647   .061875
                19630131 29233      .111
                19630131 32168  .1308125
                19630131 18745   .254375
                19630131 37057   .275625
                19630131 37380   .277875
                19630131 37372   .278875
                19630131 31245  .2884375
                19630131 38017  .2961875
                19630131 31405      .351
                19630131 30373 .36328125
                19630131 28812    .41825
                19630131 29364   .500625
                19630131 37903    .51675
                19630131 35457   .561875
                19630131 37831      .599
                19630131 30947   .746875
                19630131 29540       .78
                19630131 29217    .81125
                19630131 31253   .826875
                19630131 33080  .8778125
                19630131 29209    .91375
                19630131 30613  .9194375
                19630131 31712    .99275
                19630131 32766      1.02
                19630131 37487    1.0695
                19630131 37620 1.0704376
                19630131 29014   1.07625
                19630131 37292  1.093125
                19630131 32117    1.0965
                19630131 30066   1.09675
                19630131 33590    1.1025
                19630131 32432    1.1175
                19630131 35828  1.123375
                19630131 29954     1.125
                19630131 34702     1.125
                19630131 33697    1.1495
                19630131 33451  1.153125
                19630131 38236   1.19625
                19630131 36556   1.24875
                19630131 32328  1.286625
                19630131 36708     1.288
                19630131 29866 1.2972188
                19630131 33670  1.299375
                19630131 34243   1.30375
                19630131 30162     1.357
                19630131 35094  1.383375
                19630131 35473      1.44
                19630131 38471   1.48275
                19630131 30090 1.5121875
                19630131 36097     1.516
                19630131 34542   1.52625
                19630131 29823     1.548
                19630131 37890   1.54875
                19630131 33443  1.553125
                19630131 29639    1.5535
                19630131 34286     1.568
                19630131 35086     1.587
                19630131 37815     1.596
                19630131 28724   1.59775
                19630131 29591 1.6059375
                19630131 30306 1.6107812
                19630131 36302     1.617
                19630131 29786 1.6449375
                19630131 34921 1.6655625
                19630131 33662  1.677375
                19630131 31050     1.705
                19630131 30461   1.70775
                19630131 36345 1.7206875
                19630131 34446   1.72425
                19630131 30082      1.73
                19630131 36804    1.7355
                19630131 31237      1.75
                19630131 35545      1.75
                19630131 36142     1.764
                19630131 32408   1.76875
                19630131 35924    1.7925
                19630131 28919       1.8
                19630131 30138   1.80075
                19630131 28986    1.8135
                19630131 28986    1.8135
                19630131 28302   1.81425
                19630131 28660     1.827
                19630131 28687   1.83675
                19630131 34403    1.8655
                19630131 34577   1.87425
                19630131 36177 1.8829376
                19630131 35326     1.888
                19630131 28708     1.903
                19630131 37962   1.91475
                19630131 30904  1.921875
                19630131 30170      1.95
                19630131 37073     1.953
                19630131 30381     1.955
                19630131 32440      1.98
                19630131 35035    2.0475
                19630131 34788   2.05125
                19630131 37954  2.080875
                19630131 37436  2.082375
                19630131 28855     2.088
                end
                Clyde, I used the code you sent me. It seemed like it worked. But there were several errors (See below)

                . // GET A REAL DATE VARIABLE
                unrecognized command: / invalid command name
                r(199);

                .
                . decode YMD, gen(string_date)

                .
                . replace string_date = substr(string_date, 1, 4)+"m"+su
                > bstr(string_date, 5, .)
                string_date was str6 now str7
                (3939064 real changes made)

                .
                . gen YM = monthly(string_date, "YM")

                .
                . format YM %tm

                .
                . drop string_date YMD

                .
                .
                .
                . // GET A "FUNCTIONAL" YEAR THAT RUNS FROM JUNE THRO
                > UGH MAY
                unrecognized command: / invalid command name
                r(199);

                .
                . gen functional_year = year(dofm(YM-6))

                .
                .
                .
                . // SAVE ORIGINAL DATA
                unrecognized command: / invalid command name
                r(199);

                .
                . tempfile copy

                .
                . save `copy'
                file C:\Users\GUEST~1\AppData\Local\Temp\ST_06000001.t m
                > p saved

                .
                .
                .
                . // KEEP ONLY THE JUNE DATA TO IDENTIFY PORTFOLIOS
                unrecognized command: / invalid command name
                r(199);

                .
                . keep if month(dofm(YM)) == 6
                (3611361 observations deleted)

                .
                . isid firm_id functional_year
                variable firm_id not found
                r(111);

                .
                .
                .
                . // IDENTIFY PORTFOLIOS IN EACH YEAR
                unrecognized command: / invalid command name
                r(199);

                .
                . levelsof functional_year, local(years)
                1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1
                > 973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983
                > 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994
                > 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 200
                > 5 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015

                .
                . gen int portfolio = .
                (327703 missing values generated)

                .
                . foreach y of local years {
                2.
                . xtile pf = MktCap if functional_year == `y', nq(10
                > )
                3.
                . replace portfolio = pf if functional_year == `y'
                4.
                . drop pf
                5.
                . }
                (2102 real changes made)
                (2156 real changes made)
                (2205 real changes made)
                (2220 real changes made)
                (2248 real changes made)
                (2268 real changes made)
                (2319 real changes made)
                (2424 real changes made)
                (2517 real changes made)
                (2665 real changes made)
                (5649 real changes made)
                (5233 real changes made)
                (5025 real changes made)
                (5063 real changes made)
                (5080 real changes made)
                (4957 real changes made)
                (4916 real changes made)
                (4947 real changes made)
                (5349 real changes made)
                (5451 real changes made)
                (5832 real changes made)
                (6459 real changes made)
                (6410 real changes made)
                (6571 real changes made)
                (7100 real changes made)
                (7161 real changes made)
                (6970 real changes made)
                (6890 real changes made)
                (6758 real changes made)
                (7012 real changes made)
                (7338 real changes made)
                (8160 real changes made)
                (8337 real changes made)
                (8867 real changes made)
                (9190 real changes made)
                (9188 real changes made)
                (8582 real changes made)
                (8463 real changes made)
                (7862 real changes made)
                (7328 real changes made)
                (6881 real changes made)
                (6803 real changes made)
                (6877 real changes made)
                (6900 real changes made)
                (7091 real changes made)
                (7046 real changes made)
                (6674 real changes made)
                (6663 real changes made)
                (6810 real changes made)
                (6774 real changes made)
                (6712 real changes made)
                (7007 real changes made)
                (7187 real changes made)
                (7166 real changes made)

                .
                .
                .
                . // NOW MERGE THESE RESULTS BACK WITH THE ORIGINAL D
                > ATA
                unrecognized command: / invalid command name
                r(199);

                .
                . merge 1:m firm_id functional_year using `copy'
                variable firm_id not found
                r(111);
                Last edited by sladmin; 01 Jun 2017, 09:58. Reason: anonymize poster

                Comment


                • #9
                  Clyde, please see below the new variables:

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input float(YM functional_year) int portfolio
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  41 1962 1
                  end
                  format %tm YM
                  Thank you for your help! Can you tell me if it worked?
                  Last edited by sladmin; 01 Jun 2017, 09:06. Reason: anonymize poster

                  Comment


                  • #10
                    Hi Clyde, sry to post again, but I just saw that after using the code, which was provided by you, about 3.3 million observations were deleted and I just have stock data left that is reported in June of each year. The data for all the other months is deleted.

                    Comment


                    • #11
                      Anyone who can help me on this?

                      Thanks a lot guys!

                      Comment


                      • #12
                        Something is wrong with the way you executed the code. I think you typed it line-by-line into the command window. The command window does not accept comments (which begin with //) so it gave you all those messages about / not being a valid command every time it hit a comment. When people post code on this forum, it is generally intended that it will be copy/pasted into the do-file editor and run from there. The Command window is, at best, a clunky way to run an analysis, and it does not support some features of Stata code that most of us use freely here.

                        As for so many observations being deleted, that is exactly what was intended and is necessary for your purpose: it eliminates everything except the June observations so that you can form the portfolios. The -merge- command at the end is supposed to bring all of the other observations back. The reason the -merge- command failed is because it assumed there was a variable called firm_id: but there isn't. Evidently the variable needed her is permno. If you go back to the code in #2 and replace firm_id by permno throughout, and then run it from the do-editor, you will get what you seek.

                        Correction: the date variable in your -dataex- post is different from what you showed originally, so the first part of the code that calculates Stata date variables will not run correctly. Here's what you need to do from start to finish:

                        Code:
                        * Example generated by -dataex-. To install: ssc install dataex
                        clear
                        input long(date permno) float MktCap
                        19630131 37647   .061875
                        19630131 29233      .111
                        19630131 32168  .1308125
                        19630131 18745   .254375
                        19630131 37057   .275625
                        19630131 37380   .277875
                        19630131 37372   .278875
                        19630131 31245  .2884375
                        19630131 38017  .2961875
                        19630131 31405      .351
                        19630131 30373 .36328125
                        19630131 28812    .41825
                        19630131 29364   .500625
                        19630131 37903    .51675
                        19630131 35457   .561875
                        19630131 37831      .599
                        19630131 30947   .746875
                        19630131 29540       .78
                        19630131 29217    .81125
                        19630131 31253   .826875
                        19630131 33080  .8778125
                        19630131 29209    .91375
                        19630131 30613  .9194375
                        19630131 31712    .99275
                        19630131 32766      1.02
                        19630131 37487    1.0695
                        19630131 37620 1.0704376
                        19630131 29014   1.07625
                        19630131 37292  1.093125
                        19630131 32117    1.0965
                        19630131 30066   1.09675
                        19630131 33590    1.1025
                        19630131 32432    1.1175
                        19630131 35828  1.123375
                        19630131 29954     1.125
                        19630131 34702     1.125
                        19630131 33697    1.1495
                        19630131 33451  1.153125
                        19630131 38236   1.19625
                        19630131 36556   1.24875
                        19630131 32328  1.286625
                        19630131 36708     1.288
                        19630131 29866 1.2972188
                        19630131 33670  1.299375
                        19630131 34243   1.30375
                        19630131 30162     1.357
                        19630131 35094  1.383375
                        19630131 35473      1.44
                        19630131 38471   1.48275
                        19630131 30090 1.5121875
                        19630131 36097     1.516
                        19630131 34542   1.52625
                        19630131 29823     1.548
                        19630131 37890   1.54875
                        19630131 33443  1.553125
                        19630131 29639    1.5535
                        19630131 34286     1.568
                        19630131 35086     1.587
                        19630131 37815     1.596
                        19630131 28724   1.59775
                        19630131 29591 1.6059375
                        19630131 30306 1.6107812
                        19630131 36302     1.617
                        19630131 29786 1.6449375
                        19630131 34921 1.6655625
                        19630131 33662  1.677375
                        19630131 31050     1.705
                        19630131 30461   1.70775
                        19630131 36345 1.7206875
                        19630131 34446   1.72425
                        19630131 30082      1.73
                        19630131 36804    1.7355
                        19630131 31237      1.75
                        19630131 35545      1.75
                        19630131 36142     1.764
                        19630131 32408   1.76875
                        19630131 35924    1.7925
                        19630131 28919       1.8
                        19630131 30138   1.80075
                        19630131 28986    1.8135
                        19630131 28986    1.8135
                        19630131 28302   1.81425
                        19630131 28660     1.827
                        19630131 28687   1.83675
                        19630131 34403    1.8655
                        19630131 34577   1.87425
                        19630131 36177 1.8829376
                        19630131 35326     1.888
                        19630131 28708     1.903
                        19630131 37962   1.91475
                        19630131 30904  1.921875
                        19630131 30170      1.95
                        19630131 37073     1.953
                        19630131 30381     1.955
                        19630131 32440      1.98
                        19630131 35035    2.0475
                        19630131 34788   2.05125
                        19630131 37954  2.080875
                        19630131 37436  2.082375
                        19630131 28855     2.088
                        end
                        
                        //    GET A REAL DATE VARIABLE
                        gen int day = mod(date, 100)
                        replace date = (date - day)/100
                        gen int month = mod(date, 100)
                        replace date = (date - month)/100
                        rename date year
                        gen date = mdy(month, day, year)
                        format date %td
                        gen YM = mofd(date)
                        format YM %tm
                        
                        //    GET A "FUNCTIONAL" YEAR THAT RUNS FROM JUNE THROUGH MAY
                        gen functional_year = year(dofm(YM-6))
                        
                        //    SAVE ORIGINAL DATA
                        tempfile copy
                        save `copy'
                        
                        //    KEEP ONLY THE JUNE DATA TO IDENTIFY PORTFOLIOS
                        keep if month(dofm(YM)) == 6
                        isid permno functional_year
                        
                        //    IDENTIFY PORTFOLIOS IN EACH YEAR
                        levelsof functional_year, local(years)
                        gen int portfolio = .
                        foreach y of local years {
                            xtile pf = MktCap if functional_year == `y', nq(10)
                            replace portfolio = pf if functional_year == `y'
                            drop pf
                        }
                        
                        //    NOW MERGE THESE RESULTS BACK WITH THE ORIGINAL DATA
                        merge 1:m permno functional_year using `copy'
                        Note: The code is untested from -isid permno functional_year- down because the example data you gave contains no June entries, so there is nothing to test the creation of the portfolios with. But that is really the simpler part of the code. It may contain typos, but I'm pretty confident it's logically correct.


                        Last edited by Clyde Schechter; 21 May 2017, 20:55.

                        Comment


                        • #13
                          Dear Clyde,

                          Thank you for the code. Much appreciated. Unfortunately, it did not work. The error for the command
                          isid permno functional_year variables says:
                          “permno functional_year do not uniquely identify the observations”
                          The commands following the isid command are not executed after this error message. I hope there is a way around.

                          When I use the command isid for permno (which is stated as a unique permanent security-level identification number on the website of CRSP), Stata tells me that it does not uniquely identify the observations. Do you know a solution to that?

                          I will provide you with example data, which contains June observations. So, you can test the creation of the portfolios.

                          Unfortunately, I calculated the market capitalization (MktCap) wrongly. I am not supposed to calculate it for every stock in every month. What I am actually trying to do is the following:
                          1. For every stock in the data, calculate MktCap (shares outstanding (shrout) * price (altprc) ) on the last available / trading day of June in each year y and hold the value constant for the months from June of the same year y until May of year y+1, when MktCap is recalculated. Using this approach, the MktCap remains constant from any given June through the following May.
                          2. Calculate the natural logarithm of MktCap (same rule as in bullet 1), name the variable Size
                          3. In June of each year, compute decile breakpoints using only stocks that trade on the New York Stock Exchange (stocks, which have “1” as exchange code (see variable exchcd), and then use those breakpoints to sort all stocks in the sample (incl. stocks that trade on other exchange stocks) into 10 portfolios (e.g. 1 breakpoint = 10, data with MktCap <= 10 should go into the first portfolio)
                          4. Compute equal-weighted monthly returns on the portfolios for the next 12 months, from July in year y to June in year y+1 for the years 1963-2016. The returns for the single stocks can you find under the variable exret.
                          I hope it is clearer what I am trying to achieve with the data now. Thank you very much for your help!

                          Please find the referring variables and the example data below:

                          Code:
                          * Example generated by -dataex-. To install: ssc install dataex
                          clear
                          input long(permno date) byte exchcd long shrout float(altprc exret)
                          92241 19870630 3   2411   6.25    -.022
                          92241 19870529 3   2411   6.25   .04483
                          92241 19870831 3   2420   5.75 -.043277
                          92241 19870430 3   2411  5.875        .
                          92241 19870731 3   2420  5.875    -.081
                          92241 19870930 3   2420   5.75 -.003609
                          92241 19871231 1   2570 4.6875  .275103
                          92241 19880331 3   2545  4.375    -.019
                          92241 19880129 3   2545  4.375 -.081667
                          92241 19880229 3   2545  4.375    -.023
                          92241 19880429 3   2545  4.125 -.080143
                          92241 19871030 1   2570      4 -.331348
                          92241 19880531 3   2545      4 -.054303
                          92241 19871130 3   2570  3.625  -.11075
                          67619 19811030 1  10739 3.3125  .178558
                          67619 19810630 3  12247 2.6875     -.04
                          67619 19811130 3  10739 2.6875 -.241679
                          67619 19810930 1  10739 2.6875  -.03519
                          67619 19810630 3   8989  2.625  .074135
                          66966 19831230 3   3945  6.375  .074696
                          67619 19820129 3  12247 2.6875     -.04
                          67619 19810430 3   8989   2.75  .243118
                          end
                          Please note that MktCap is not included in the sample data, since my calculation was wrong (as stated above).

                          Best,
                          Guest
                          Last edited by sladmin; 01 Jun 2017, 09:06. Reason: anonymize poster

                          Comment


                          • #14
                            Anyone else who can help me with this?

                            Much appreciated!

                            Comment


                            • #15
                              Guest,

                              Your list of requests is growing rather long, and is probably more than can be handled in a Forum like this. Let's see if we can't get you through the portfolio selection process, and then you can try to take it from there.

                              The problem you are encountering, with the error message
                              variables permno functional_year do not uniquely identify the observations
                              means exactly what it says. Within the context of the data you sent, there are two observations for permno 67619 in June 1981. And they have different values for all of the variables you show. (And, frustratingly, the key variable MarketCap is not in your latest example!) So your whole portfolio proposal is undefinable: how would we know which of these contradictory observations to use to decide which decile permno 67619 belongs in in June 1981?

                              So there are several possibilities here:

                              1. Your data set is contaminated with surplus incorrect observations that aren't supposed to be there. If this is the case you need to eliminate them. You might do that by going back to the person who created this data set for you and either pointing out the errors and having them fix it, or that person might be able to explain to you how to select which of the multiple observations is the one that is appropriate for your purposes.

                              2. The data may be correct and there is some rule for deciding which of the multiple observations for a given permno on the same date is the one that is appropriate for the portfolio selection. Find out what that is, and then we can program that selection into the code before we get to that -isid- command.

                              3. Perhaps permno, in combination with some other variable, is the appropriate unique identifier here. Again, if we can find out what that is, we can build portfolios based on that combination of variables, rather than on permno's.

                              But at the end of the day, it isn't possible to group permno's into deciles of MarketCap on a given date if the same permno can have different values of MarketCap on that date.

                              You need to figure this out and fix your data set. Then post back with a corrected example of your data. Be sure to include June observations, and be sure to include the MarketCap variable.
                              Last edited by sladmin; 01 Jun 2017, 09:07. Reason: anonymize original poster

                              Comment

                              Working...
                              X