  • 2-stage regression of 49 industry stock portfolios

    Dear all,

    I have time-series of returns for 49 different industry stock portfolios. I want to regress each of the 49 time-series against 4 factors (IP, MktRF, HML, SMB). After that, I need to regress the returns against all the constants and betas which resulted from the first regression. The generated new betas would be called λs. Finally, I want to aggregate the λs by calculating the mean. So in the end I should have the 5 mean values λ_const, λ_IP, λ_MktRF, λ_HML and λ_SMB.

    I already tried the following (Please note that I will not include the variables for all 49 Portfolios, but just a sample):
    Code:
    foreach var of varlist Agric_eret Food_eret Soda_eret {
    asreg `var' IP MktRF HML SMB, fmb
    }
    Now I get 49 regression tables. How can I proceed with the aforementioned next steps? I could save each regression in a separate new .dta and merge them afterwards, but this does not seem to be the most efficient solution.
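
    One way to avoid 49 separate .dta files is to post the coefficients from each regression into a single results file with -postfile-. The following is only a sketch: it assumes the wide data are in memory, it uses plain -regress- for the time-series step, and the names `results', firstpass, portfolio and b_* are made up for illustration.

    ```stata
    * Collect first-pass coefficients from every portfolio in one dataset.
    * Hypothetical names: results, firstpass, portfolio, b_*.
    tempname results
    tempfile firstpass
    postfile `results' str32 portfolio ///
        double(b_IP b_MktRF b_HML b_SMB b_cons) using `firstpass'

    foreach var of varlist Agric_eret Food_eret Soda_eret {
        * One time-series regression per portfolio
        quietly regress `var' IP MktRF HML SMB
        * Append one row: portfolio name plus its four betas and constant
        post `results' ("`var'") (_b[IP]) (_b[MktRF]) (_b[HML]) (_b[SMB]) (_b[_cons])
    }
    postclose `results'

    * Note: this replaces the data in memory with the collected coefficients
    use `firstpass', clear
    list
    ```

    Each row of the resulting dataset then holds the constant and betas of one portfolio, which is the input needed for the second regression.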

    Please see a data sample:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(date_adj Agric_eret Food_eret Soda_eret IP) double(MktRF HML SMB)
     24  -3.66   -6.8 -100.23   -.8879781 -3.87  4.98  1.86
     25   -4.2   -.07 -100.19   -.8879781  1.81   .89 -1.18
     26 -13.01   1.58 -100.19   -.8879781  -.68    -1   .23
     27  -1.98  -4.69 -100.21   -.8879781 -6.59   .48  -.99
     28 -11.47 -11.24 -100.23   -.8879781 -8.65  2.32 -3.02
     29 -10.68  -8.68 -100.19   -.8879781 -8.47  2.79  -.76
     30   7.95   6.89 -100.26   -.8879781  6.28 -3.62  1.61
     31   -.18   -.19 -100.22   -.8879781  2.13 -1.22  1.25
     32  -6.15  -4.88  -100.2   -.8879781 -5.22  1.31 -2.49
     33 -13.96  -2.57 -100.24   -.8879781  -.05  1.35 -4.01
     34  15.12  11.95 -100.19   -.8879781 10.87  1.05  2.58
     35    .55   2.65 -100.22   -.8879781  1.01   .34  -3.8
    end
    format %tm date_adj

    Any help would be much appreciated.

    Kind regards,
    Alex
    Last edited by Alexander Schmidt; 19 Jun 2018, 06:30.

  • #2
    Hi Alex,

    Here's one approach that might work (not tested).

    Your data are in wide format. That is, you have observations X_it, where i identifies industry and t identifies time. All the data for each date are in a single row, and there are 49 return variables. You could use the reshape command to rearrange them so that there is only one return variable (eret) per row, identified by date and an additional group variable that indicates the industry; call it ind.

    Then I think

    Code:
    statsby _b, by(ind): asreg eret IP MktRF HML SMB, fmb
    will save all the coefficients for you.
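
    As a rough illustration of the reshape step, here is a sketch using the first three rows of the data sample posted above. It assumes the return variables all end in _eret, as in that sample; ind_id is a made-up name for a numeric panel identifier.

    ```stata
    clear
    input float(date_adj Agric_eret Food_eret Soda_eret IP) double(MktRF HML SMB)
    24  -3.66  -6.8 -100.23 -.8879781 -3.87 4.98  1.86
    25   -4.2  -.07 -100.19 -.8879781  1.81  .89 -1.18
    26 -13.01  1.58 -100.19 -.8879781  -.68   -1   .23
    end
    format %tm date_adj

    * Stack the *_eret columns; "@" marks where the industry name sits
    reshape long @_eret, i(date_adj) j(ind) string
    rename _eret eret

    * Numeric panel id (hypothetical name), in case a later command needs xtset
    encode ind, generate(ind_id)
    xtset ind_id date_adj
    list date_adj ind eret IP MktRF HML SMB
    ```

    The factor variables (IP, MktRF, HML, SMB) are not part of the stub, so they are simply repeated on every industry row for the same date.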

    Best,
    Devra
    Devra Golbe
    Professor Emerita, Dept. of Economics
    Hunter College, CUNY


    • #3
      Thank you very much Devra. I tried your approach. Please see my commands:
      Code:
      reshape long eret, i(date_adj) j(Ind, string)
      label var Ind "Industry"
      label var eret "Excess Return"
      After reshaping, my data looks like this:
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float date_adj str6 Ind float IP double(MktRF HML SMB) float eret
      -115 "_Aero"  . -5.94  -.78 -2.38   -3.14
      -115 "_Agric" . -5.94  -.78 -2.38   -4.54
      -115 "_Autos" . -5.94  -.78 -2.38   -4.03
      -115 "_Banks" . -5.94  -.78 -2.38  -10.13
      -115 "_Beer"  . -5.94  -.78 -2.38   -2.27
      -115 "_BldMt" . -5.94  -.78 -2.38   -8.61
      -115 "_Books" . -5.94  -.78 -2.38   -7.89
      -115 "_Boxes" . -5.94  -.78 -2.38  -11.14
      -115 "_BusSv" . -5.94  -.78 -2.38   -6.16
      -115 "_Chems" . -5.94  -.78 -2.38   -6.79
      -115 "_Chips" . -5.94  -.78 -2.38   -5.66
      -115 "_Clths" . -5.94  -.78 -2.38   -5.38
      -115 "_Cnstr" . -5.94  -.78 -2.38  -13.23
      -115 "_Coal"  . -5.94  -.78 -2.38   -7.72
      -115 "_Drugs" . -5.94  -.78 -2.38   -3.67
      -115 "_ElcEq" . -5.94  -.78 -2.38   -7.85
      -115 "_FabPr" . -5.94  -.78 -2.38 -100.09
      end
      format %tm date_adj
      Then I did the regression with the following commands:
      Code:
      statsby _b, by(Ind) clear: asreg eret IP MktRF HML SMB, fmb
      foreach var of varlist _b_IP _b_MktRF _b_HML _b_SMB _b_cons {
      egen mean_`var' = mean(`var')
      }
      The mean values are now closer to the values that I need to replicate, so I guess your procedure goes in the right direction. However, they are still not close enough to be reliable. I think I made a mistake in the steps of the Fama-MacBeth regression. I went through a lot of threads on Statalist regarding this topic and summarized the steps of the Fama-MacBeth regression as follows:

      1. Run N time-series regressions.
      2. Perform one cross-sectional regression, where the N coefficient estimates from (1) are your explanatory variables.
      3. Repeat (1) and (2) going ahead in time to get a time-series of coefficient estimates from (2). Use this time-series to obtain the "average coefficient" and its standard error.
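
      The steps above can be sketched in Stata roughly as follows. This is a simplified variant, not a tested replication: it assumes the long data from the reshape are in memory (Ind, date_adj, eret and the four factors), it estimates the betas once over the full sample instead of rolling forward in time as in step (3), and it runs the cross-sectional regression of step (2) once per month. The names b_* and lam_* are made up for illustration.

      ```stata
      * Pass 1: one time-series regression per industry; keep the betas
      preserve
      statsby b_IP=_b[IP] b_MktRF=_b[MktRF] b_HML=_b[HML] b_SMB=_b[SMB], ///
          by(Ind) clear: regress eret IP MktRF HML SMB
      tempfile betas
      save `betas'
      restore

      * Attach each industry's betas to all of its monthly observations
      merge m:1 Ind using `betas', nogenerate

      * Pass 2: one cross-sectional regression per month, returns on betas
      statsby lam_cons=_b[_cons] lam_IP=_b[b_IP] lam_MktRF=_b[b_MktRF] ///
          lam_HML=_b[b_HML] lam_SMB=_b[b_SMB], by(date_adj) clear: ///
          regress eret b_IP b_MktRF b_HML b_SMB

      * Time-series averages of the monthly lambdas with their standard errors
      mean lam_cons lam_IP lam_MktRF lam_HML lam_SMB
      ```

      In this sketch each monthly cross-section has at most 49 observations (one per portfolio), and the standard errors reported by -mean- are the plain time-series standard errors of the monthly estimates, without any autocorrelation adjustment.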

      Based on my commands, can anybody spot the mistake I made?

      Any further help would be much appreciated.

      Alex


      • #4
        Hi Alex,

        You need to be specific about what you mean by a Fama-MacBeth regression, and in particular, you should give complete references. Maybe you are referring to the procedure described here:

        Fama, Eugene F., and James D. MacBeth. “Risk, Return, and Equilibrium: Empirical Tests.” Journal of Political Economy, vol. 81, no. 3, 1973, pp. 607–636. JSTOR, www.jstor.org/stable/1831028.

        Or maybe you mean the one described in the Stata module xtfmb, written by Daniel Hoechle, from http://fmwww.bc.edu/RePEc/bocode/x

        From a quick read, they do not appear the same to me.

        Best,
        Devra
        Devra Golbe
        Professor Emerita, Dept. of Economics
        Hunter College, CUNY


        • #5
          Hi Devra,

          Yes, I am referring to the paper you mentioned. Basically, I have already done the first step of the FMB regression. Now I want to conduct the cross-sectional regression. The problem is that I don't have single stocks but 49 stock portfolios. Hence, I have 49 return observations per date and I don't know how to handle them correctly in the regression. I illustrated the problem more thoroughly in this thread:

          https://www.statalist.org/forums/for...ock-portfolios

          Best,
          Alex
