Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Monthly Portfolio Returns

    Hi Statalist Experts,

    I am trying to do the following (see below) with my data, but unfortunately I am not able to implement it. Unfortunately, there is no unique identifier to merge two datasets, since throughout the time period the stocks and their personal permno code appear several times. I hope you can help me how to edit my data. I want to do the following:
    1. For every stock in the data, calculate MktCap (shares outstanding (shrout) * price (altprc) ) on the last available / trading day of June in each year y and hold the value constant for the months from June of the same year y until May of year y+1, when MktCap is recalculated. Using this approach, the MktCap remains constant from any given June through the following May.
    2. Calculate the natural logarithm of MktCap (same rule as in bullet 1), name the variable Size
    3. In June of each year, compute decile breakpoints using only stocks that trade on the New York Stock Exchange (stocks, which have “1” as exchange code (see variable exchcd), and then use those breakpoints to sort all stocks in the sample (incl. stocks that trade on other exchange stocks) into 10 portfolios (e.g. 1 breakpoint = 10, data with MktCap <= 10 should go into the first portfolio)
    4. Compute equal-weighted monthly returns on the portfolios for the next 12 months, from July in year y to June in year y+1 for the years 1963-2016 (my whole data time period). The returns for the single stocks can you find under the variable exret.
    Please find the referring variables and the example data below:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long(permno date) byte exchcd long shrout float(altprc exret)
    92241 19870630 3   2411   6.25    -.022
    92241 19870529 3   2411   6.25   .04483
    92241 19870831 3   2420   5.75 -.043277
    92241 19870430 3   2411  5.875        .
    92241 19870731 3   2420  5.875    -.081
    92241 19870930 3   2420   5.75 -.003609
    92241 19871231 1   2570 4.6875  .275103
    92241 19880331 3   2545  4.375    -.019
    92241 19880129 3   2545  4.375 -.081667
    92241 19880229 3   2545  4.375    -.023
    92241 19880429 3   2545  4.125 -.080143
    92241 19871030 1   2570      4 -.331348
    92241 19880531 3   2545      4 -.054303
    92241 19871130 3   2570  3.625  -.11075
    67619 19811030 1  10739 3.3125  .178558
    67619 19810630 3  12247 2.6875     -.04
    67619 19811130 3  10739 2.6875 -.241679
    67619 19810930 1  10739 2.6875  -.03519
    67619 19810630 3   8989  2.625  .074135
    66966 19831230 3   3945  6.375  .074696
    67619 19820129 3  12247 2.6875     -.04
    67619 19810430 3   8989   2.75  .243118
    end
    Please note that MktCap is not included in the sample data, since my calculation was wrong (as stated above). Thank you very much for your help!

    Best,
    Guest
    Last edited by sladmin; 01 Jun 2017, 09:08. Reason: anonymize poster

  • #2
    This seems to be a continuation of the earlier thread at http://www.statalist.org/forums/foru...ile-portfolios.

    I note that the earlier thread included guidance on constructing useful Stata Internal Format date variables, but this data sample contains the less-than-useful 8-digit-number version of the date with which the earlier thread ended.

    Comment


    • #3
      Hi William,

      Yes, it is.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long(permno date) byte exchcd long shrout float(altprc exret)
      92241 19870630 1   2411   6.25    -.022
      92241 19870529 3   2411   6.25   .04483
      92241 19870831 3   2420   5.75 -.043277
      92241 19870430 3   2411  5.875        .
      92241 19870731 3   2420  5.875    -.081
      92241 19870930 3   2420   5.75 -.003609
      92241 19871231 3   2570 4.6875  .275103
      92241 19880331 1   2545  4.375    -.019
      92241 19880129 3   2545  4.375 -.081667
      92241 19880229 3   2545  4.375    -.023
      92241 19880429 3   2545  4.125 -.080143
      92241 19871030 3   2570      4 -.331348
      92241 19880630 2   2545      4 -.054303
      92241 19871130 3   2570  3.625  -.11075
      67619 19811030 3  10739 3.3125  .178558
      67619 19811231 3  12247 2.6875     -.04
      67619 19811130 2  10739 2.6875 -.241679
      67619 19810930 3  10739 2.6875  -.03519
      67619 19810630 3   8989  2.625  .074135
      66966 19831230 3   3945  6.375  .074696
      67619 19820129 3  12247 2.6875     -.04
      67619 19810430 1   8989   2.75  .243118
      end
      This data sample should contain the 8-digit-number version. Thank you for your help William!

      Comment


      • #4
        Guest,

        I don't understand your initial query ("there is no unique identifier to merge two datasets, since throughout the time period the stocks and their personal permno code appear several times.") Your example shows only one dataset, and I see no dups in the list. If you have true dups-- every variable is repeated for two or more observations-- then you could simply drop the duplicates. You might want to worry about why they are there in the first place; that might suggest a problem with your initial data draw. If they are only partial dups-- e.g., same date & permno, but different shrout, etc.-- then you need to resolve which observation is correct. That's not a Stata question.

        As for the rest, your question as posed in 1-4 sounds like something I might give as a final project in an empirical finance course. In fact, I would probably break down the large problem into the same component steps you have listed in order to point students in the right direction. Since you seem to need some guidance to progress further, I would reiterate what I think you were told in the previous thread.

        1. Take you data set (A) and save all the June observations into a separate dataset B. Use B to compute size and sort the stocks into deciles for each year.
        2. Merge those year-decile indicators back to dataset A.
        3. Compute the equal-weighted monthly returns on each portfolio.

        Hope this helps.
        Last edited by sladmin; 01 Jun 2017, 09:08. Reason: anonymize poster
        Devra Golbe
        Professor Emerita, Dept. of Economics
        Hunter College, CUNY

        Comment


        • #5
          Devra,
          I mentioned it since I exactly tried to do what you said before. But in order to execute the merger command it needs a unique identifier, doesn't it? I cannot delete the duplicates, since I need every stock data in the dataset. There are just the same stocks but at different times. I still need to include everyone of them.

          Can you provide me with some sample code so I can try to solve the issue?

          Thanks for your help!

          Comment


          • #6
            You need to do a one-to-many (or many-to-one) merge using year (July to June in your problem) and permno as your identifiers. Check the help for -merge- to see how to do this.
            Devra Golbe
            Professor Emerita, Dept. of Economics
            Hunter College, CUNY

            Comment


            • #7
              Devra,

              I tried to, but Stata says:

              . merge 1:m permno using `copy'
              variable permno does not uniquely identify
              observations in the master data

              I tried the code, which was provided by Clyde and tried to get the June data back to my master file. Unfortunately, it didn't work. Permno is a specific stock code. But throughout the data period one stock reports prices several times. Hence, the permno appears more than once and cannot function as an identifier.

              I used the following code:

              // GET A REAL DATE VARIABLE
              gen int day = mod(date, 100)
              replace date = (date - day)/100
              gen int month = mod(date, 100)
              replace date = (date - month)/100
              rename date year
              gen date = mdy(month, day, year)
              format date %td
              gen YM = mofd(date)
              format YM %tm

              // GET A "FUNCTIONAL" YEAR THAT RUNS FROM JUNE THROUGH MAY
              gen functional_year = year(dofm(YM-6))

              // SAVE ORIGINAL DATA
              tempfile copy
              save `copy'

              // KEEP ONLY THE JUNE DATA TO IDENTIFY PORTFOLIOS
              keep if month(dofm(YM)) == 6

              // IDENTIFY PORTFOLIOS IN EACH YEAR
              levelsof functional_year, local(years)
              gen int portfolio = .
              foreach y of local years {
              xtile pf = MktCap if functional_year == `y', nq(10)
              replace portfolio = pf if functional_year == `y'
              drop pf
              }

              // NOW MERGE THESE RESULTS BACK WITH THE ORIGINAL DATA
              merge m:1 permno using `copy'


              Comment


              • #8
                If you think about your merge command, the "master" dataset (the one in memory) should have one observation for each combination of permno and functional_year, and the "using" dataset will have observations for up to 12 months for each combination of permno and functional_year. This suggests that you should follow Prof. Golbe's advice from post 6 and correct Clyde's oversight by using
                Code:
                merge 1:m permno functional_year using `copy'
                and certainly not the command you show at the bottom of post #7, which instead of Clyde's code has
                Code:
                merge m:1 ...
                I also note that you have removed from Clyde's code
                Code:
                isid firm_id functional_year
                following the keep command, which confirms the assumption that there is only one observation for each combination of permno and functional year in the master dataset in memory. You should replace the isid command. And if it reports an error, then you have a problem where some permno for some functional_year had more than one observation with a June date. A possibility which Clyde could not have anticipated from your description of your data.
                Last edited by William Lisowski; 22 May 2017, 15:17.

                Comment


                • #9
                  You need to do a one-to-many (or many-to-one) merge using year (July to June in your problem) and permno as your identifiers.
                  Devra told you to do the above, but the code you posted in response to Devra only uses permno as the identifier for the merge.

                  But throughout the data period one stock reports prices several times. Hence, the permno appears more than once and cannot function as an identifier.
                  ... which is why you need to do the merge based on both permno AND your date variable.

                  Comment


                  • #10
                    This thread continues at #16 in http://www.statalist.org/forums/foru...rtfolios/page2, the thread in which it actually began.

                    I am hopeful that my comment there in #17 will solve Guest's problem once and for all.
                    Last edited by sladmin; 01 Jun 2017, 09:08. Reason: anonymize poster

                    Comment

                    Working...
                    X