2-stage regression of 49 industry stock portfolios

Alexander Schmidt

Join Date: Jun 2018

Posts: 25
#1

19 Jun 2018, 05:51

Dear all,

I have time series of returns for 49 different industry stock portfolios. I want to regress each of the 49 time series on 4 factors (IP, MktRF, HML, SMB). After that, I need to regress the returns on the constants and betas that resulted from the first regression. The resulting new coefficients would be called λs. Finally, I want to aggregate the λs by calculating their means, so in the end I should have the 5 mean values λ_const, λ_IP, λ_MktRF, λ_HML and λ_SMB.

I already tried the following (Please note that I will not include the variables for all 49 Portfolios, but just a sample):

Code:

foreach var of varlist Agric_eret Food_eret Soda_eret {
    asreg `var' IP MktRF HML SMB, fmb
}

Now I get 49 regression tables. How can I proceed with the next steps described above? I could save each regression in a separate new .dta file and merge them afterwards, but this does not seem to be the most efficient solution.

Please see a data sample:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(date_adj Agric_eret Food_eret Soda_eret IP) double(MktRF HML SMB)
24 -3.66 -6.8 -100.23 -.8879781 -3.87 4.98 1.86
25 -4.2 -.07 -100.19 -.8879781 1.81 .89 -1.18
26 -13.01 1.58 -100.19 -.8879781 -.68 -1 .23
27 -1.98 -4.69 -100.21 -.8879781 -6.59 .48 -.99
28 -11.47 -11.24 -100.23 -.8879781 -8.65 2.32 -3.02
29 -10.68 -8.68 -100.19 -.8879781 -8.47 2.79 -.76
30 7.95 6.89 -100.26 -.8879781 6.28 -3.62 1.61
31 -.18 -.19 -100.22 -.8879781 2.13 -1.22 1.25
32 -6.15 -4.88 -100.2 -.8879781 -5.22 1.31 -2.49
33 -13.96 -2.57 -100.24 -.8879781 -.05 1.35 -4.01
34 15.12 11.95 -100.19 -.8879781 10.87 1.05 2.58
35 .55 2.65 -100.22 -.8879781 1.01 .34 -3.8
end
format %tm date_adj

Any help would be much appreciated.

Kind regards,
Alex

Last edited by Alexander Schmidt; 19 Jun 2018, 06:30.
Devra Golbe

Join Date: Apr 2014

Posts: 170
#2

19 Jun 2018, 12:28

Hi Alex,

Here's one approach that might work (not tested).

Your data are in wide format. That is, you have observations X_it, where i identifies the industry and t identifies time. All the data for each date are in a single row, and there are 49 return variables. You could use the reshape command to rearrange them so that there is only one return variable (eret) per row, identified by date and an additional group variable that indicates the industry; call it ind.
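A minimal sketch of that reshape, assuming every return variable is named <industry>_eret as in the sample in #1. The two data rows are copied from that sample, and the name ind is only a placeholder:

```stata
* Sketch only: assumes return variables are named <industry>_eret
* and that date_adj uniquely identifies the rows of the wide data.
clear
input float(date_adj Agric_eret Food_eret Soda_eret IP) double(MktRF HML SMB)
24 -3.66 -6.8 -100.23 -.8879781 -3.87 4.98 1.86
25 -4.2 -.07 -100.19 -.8879781 1.81 .89 -1.18
end
format %tm date_adj

* @ marks where the industry name sits inside the variable name;
* the industry name becomes the string variable ind.
reshape long @_eret, i(date_adj) j(ind) string

* The stub with @ removed is "_eret"; give it a friendlier name.
rename _eret eret

list date_adj ind eret MktRF, sepby(date_adj)
```

After this, each row is one industry-month, and the factor values are repeated across the industries within a month.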

Then I think

Code:

statsby _b, by(ind): asreg eret IP MktRF HML SMB, fmb

will save all the coefficients for you.

Best,
Devra

Devra Golbe
Professor Emerita, Dept. of Economics
Hunter College, CUNY

Alexander Schmidt

20 Jun 2018, 06:01

Thank you very much Devra. I tried your approach. Please see my commands:

Code:

reshape long eret, i(date_adj) j(Ind, string)
label var Ind "Industry"
label var eret "Excess Return"

After reshaping, my data looks like this:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float date_adj str6 Ind float IP double(MktRF HML SMB) float eret
-115 "_Aero"  . -5.94  -.78 -2.38   -3.14
-115 "_Agric" . -5.94  -.78 -2.38   -4.54
-115 "_Autos" . -5.94  -.78 -2.38   -4.03
-115 "_Banks" . -5.94  -.78 -2.38  -10.13
-115 "_Beer"  . -5.94  -.78 -2.38   -2.27
-115 "_BldMt" . -5.94  -.78 -2.38   -8.61
-115 "_Books" . -5.94  -.78 -2.38   -7.89
-115 "_Boxes" . -5.94  -.78 -2.38  -11.14
-115 "_BusSv" . -5.94  -.78 -2.38   -6.16
-115 "_Chems" . -5.94  -.78 -2.38   -6.79
-115 "_Chips" . -5.94  -.78 -2.38   -5.66
-115 "_Clths" . -5.94  -.78 -2.38   -5.38
-115 "_Cnstr" . -5.94  -.78 -2.38  -13.23
-115 "_Coal"  . -5.94  -.78 -2.38   -7.72
-115 "_Drugs" . -5.94  -.78 -2.38   -3.67
-115 "_ElcEq" . -5.94  -.78 -2.38   -7.85
-115 "_FabPr" . -5.94  -.78 -2.38 -100.09
end
format %tm date_adj

Then I did the regression with the following commands:

Code:

statsby _b, by(Ind) clear: asreg eret IP MktRF HML SMB, fmb
foreach var of varlist _b_IP _b_MktRF _b_HML _b_SMB _b_cons {
    egen mean_`var' = mean(`var')
}

The mean values are now closer to the values that I need to replicate, so I guess your procedure points in the right direction. However, they are still not close enough to be reliable. I think I made a mistake in the steps of the Fama-MacBeth regression. I went through many Statalist threads on this topic and summarized the steps of the Fama-MacBeth regression as follows:

1. Run N time-series regressions.
2. Perform one cross-sectional regression, where the N coefficient estimates from (1) are your explanatory variables.
3. Repeat (1) and (2) going ahead in time to get a time-series of coefficient estimates from (2). Use this time-series to obtain the "average coefficient" and its standard error.
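Read literally, these steps could be sketched roughly as below. This is a simplified illustration under stated assumptions, not a verified replication: it uses plain regress rather than asreg, it estimates the betas once over the full sample instead of over rolling windows, and it runs on a simulated toy panel (10 portfolios, 60 months) so that it is self-contained. The file name fm_betas and the numeric ind variable are made up for the example.

```stata
* --- Hypothetical toy panel in long format (not the real data) ---
clear
set seed 12345
set obs 60
gen date_adj = _n
format %tm date_adj
foreach f in IP MktRF HML SMB {
    gen `f' = rnormal()
}
expand 10                          // 10 portfolios per month
bysort date_adj: gen ind = _n      // portfolio identifier 1..10
gen eret = 0.5 + (ind/10)*MktRF + rnormal()

* --- Step 1: one time-series regression per portfolio ---
* Saves the four factor betas of each portfolio to fm_betas.dta.
statsby b_IP=_b[IP] b_MktRF=_b[MktRF] b_HML=_b[HML] b_SMB=_b[SMB], ///
    by(ind) saving(fm_betas, replace): regress eret IP MktRF HML SMB

* Attach each portfolio's betas to all of its monthly observations.
merge m:1 ind using fm_betas, assert(match) nogenerate

* --- Step 2: one cross-sectional regression per month ---
* Regressors are the estimated betas; the coefficients are the lambdas.
statsby _b, by(date_adj) clear: regress eret b_IP b_MktRF b_HML b_SMB

* --- Step 3: average the monthly lambdas over time ---
* The std. err. reported by -mean- is sd/sqrt(T).
mean _b_*
```

The point of the sketch is that the second pass regresses returns on the estimated betas across portfolios within each month, so averaging the first-pass betas themselves does not give the λs.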

Based on my commands, can anybody spot the mistake I made?

Any further help would be much appreciated.

Alex


Devra Golbe

#4

20 Jun 2018, 11:55

Hi Alex,

You need to be specific about what you mean by a Fama-MacBeth regression, and in particular, you should give complete references. Maybe you are referring to the procedure described here:

Fama, Eugene F., and James D. MacBeth. “Risk, Return, and Equilibrium: Empirical Tests.” Journal of Political Economy, vol. 81, no. 3, 1973, pp. 607–636. JSTOR, www.jstor.org/stable/1831028.

Or maybe to the one described in the Stata module xtfmb, written by Daniel Hoechle, from http://fmwww.bc.edu/RePEc/bocode/x

From a quick read, they do not appear the same to me.

Best,
Devra

Alexander Schmidt

#5

21 Jun 2018, 02:47

Hi Devra,

Yes, I am referring to the paper you mentioned. Basically, I have already done the first step of the FMB regression. Now I want to run the cross-sectional regression. The problem is that I don't have single stocks but 49 stock portfolios. Hence, I have 49 return observations per date, and I don't know how to handle them correctly in the regression. I described the problem more thoroughly in this thread:

https://www.statalist.org/forums/for...ock-portfolios

Best,
Alex