Fama-MacBeth cross-sectional regression with 49 stock portfolios

Alexander Schmidt

Join Date: Jun 2018
Posts: 25

Fama-MacBeth cross-sectional regression with 49 stock portfolios

20 Jun 2018, 10:20

Dear all,

The Fama-MacBeth (FMB) procedure has been discussed in several threads. However, I can't find a thread how to apply this procedure to multiple stock portfolios instead of single stocks. I try to formulate my question to focus on the execution in Stata. Hence, someone not familiar with FMB might also have an idea how to solve the problem.

My data looks as following:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str6 Ind float(_b_IP _b_MktRF _b_HML _b_SMB date_adj eret)
"_Aero" .04750393 1.1349244 .3038464 .1914637 -115 -3.14
"_Aero" .04750393 1.1349244 .3038464 .1914637 -114 15.82
"_Aero" .04750393 1.1349244 .3038464 .1914637 -113 -3.21
"_Aero" .04750393 1.1349244 .3038464 .1914637 -112 -1.93
"_Aero" .04750393 1.1349244 .3038464 .1914637 -111  -.14
"_Aero" .04750393 1.1349244 .3038464 .1914637 -110  8.43
"_Aero" .04750393 1.1349244 .3038464 .1914637 -109  12.8
"_Aero" .04750393 1.1349244 .3038464 .1914637 -108  1.79
"_Aero" .04750393 1.1349244 .3038464 .1914637 -107  2.97
"_Aero" .04750393 1.1349244 .3038464 .1914637 -106 -8.63
"_Aero" .04750393 1.1349244 .3038464 .1914637 -105  5.23
"_Aero" .04750393 1.1349244 .3038464 .1914637 -104 -3.41
"_Aero" .04750393 1.1349244 .3038464 .1914637 -103 -9.01
"_Aero" .04750393 1.1349244 .3038464 .1914637 -102 10.55
"_Aero" .04750393 1.1349244 .3038464 .1914637 -101  4.83
"_Aero" .04750393 1.1349244 .3038464 .1914637 -100  4.52
"_Aero" .04750393 1.1349244 .3038464 .1914637  -99 -1.87
"_Aero" .04750393 1.1349244 .3038464 .1914637  -98 -2.93
"_Aero" .04750393 1.1349244 .3038464 .1914637  -97  1.86
"_Aero" .04750393 1.1349244 .3038464 .1914637  -96  4.32
"_Aero" .04750393 1.1349244 .3038464 .1914637  -95  -4.6
"_Aero" .04750393 1.1349244 .3038464 .1914637  -94   .95
"_Aero" .04750393 1.1349244 .3038464 .1914637  -93 -6.64
"_Aero" .04750393 1.1349244 .3038464 .1914637  -92  7.22
"_Aero" .04750393 1.1349244 .3038464 .1914637  -91  7.13
"_Aero" .04750393 1.1349244 .3038464 .1914637  -90   .29
"_Aero" .04750393 1.1349244 .3038464 .1914637  -89  1.05
"_Aero" .04750393 1.1349244 .3038464 .1914637  -88 -1.95
"_Aero" .04750393 1.1349244 .3038464 .1914637  -87  1.61
"_Aero" .04750393 1.1349244 .3038464 .1914637  -86  5.03
end
format %tm date_adj

I have to run a regression for each of the single 49 industry portfolios for each single month. So that would be 49x700 regressions. Thereby, I regress eret against _b_IP, _b_MktRF, _b_HML and _b_SMB. The values for the variables _b_IP, _b_MktRF, _b_HML and _b_SMB for each industry portfolio (variable "Ind") are constant over time (variable "date_adj"). After that I calculate the mean for the new _b values, so that I should have 4 different mean betas overall in the end. Actually, I think the following command would be appropriate:

Code:

bys date_adj Ind: asreg eret _b_IP _b_MktRF _b_HML _b_SMB, fmb

This command should do all the steps from regressing to aggregating. However, the problem is that I have multiple values for eret per Ind. How can I consider this in my approach?

Any help would be much appreciated.

Best regards,
Alex

Tags: None

Alexander Schmidt

Join Date: Jun 2018

Posts: 25
#2

21 Jun 2018, 06:06

I tried a new approach:

Code:

statsby _b, by(date_adj) clear: reg eret _b_IP _b_MktRF _b_HML _b_SMB

Do I understand the command correctly: This procedure performs single regressions for each of the various eret data points per date and aggregates the values so that I have series with exactly T observations each?
I also tried to perform the FMB approach in just one step by:

Code:

statsby _b, by(date_adj) clear: asreg eret _b_IP _b_MktRF _b_HML _b_SMB, fmb

Unfortunately, I only get "xxxxxxxxxxxx" displayed in Stata.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35760
#3

21 Jun 2018, 06:17

Comments on various levels:

asreg (SSC) is a community-contributed command from Attaullah Shah, as you are asked to explain (FAQ Advice #12). Attaullah is a member of Statalist but he evidently doesn't check posts here daily, so your best bet for support may be to contact him directly.

I won't try to answer on Fama-MacBeth, having never used it or even tried to understand it, but it is no part of statsby to imply or pretend that your subsets are all of full or even equal length. Rather, you would get one observation for each subset in a dataset of results.
Comment
Alexander Schmidt

Join Date: Jun 2018

Posts: 25
#4

21 Jun 2018, 07:14

Thanks for your answer Nick. But how does statsby actually aggregate the _b values? At first I had multiple rows of values per industry. I performed the command:

Code:

statsby _b, by(Ind) clear: reg eret IP MktRF HML SMB

Then I got 1 row of values per industry (which is exactly what I need). Does statsby just calculate the mean in this case?

Last edited by Alexander Schmidt; 21 Jun 2018, 07:18.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35760
#5

21 Jun 2018, 09:29

No; your example concerns regression, so there is one observation per industry and with your syntax all you get are the coefficients from each regression in each observations. You can do stuff starting like

Code:

statsby nobs=e(N) _b,

and get other results saved, here the number of observations used as well as the coefficients. Alternatively, see e.g. rangestat (SSC) for a different set of defaults.
Comment
Devra Golbe

Join Date: Apr 2014

Posts: 170
#6

21 Jun 2018, 11:18

Alex,

Don't know why you started a new thread, since this seems to me to be the same discussion as your previous thread. In any case, You will get more helpful responses if you state your questions more clearly (perhaps for yourself as well as the list.)
You write

I have to run a regression for each of the single 49 industry portfolios for each single month. So that would be 49x700 regressions. Thereby, I regress eret against _b_IP, _b_MktRF, _b_HML and _b_SMB. The values for the variables _b_IP, _b_MktRF, _b_HML and _b_SMB for each industry portfolio (variable "Ind") are constant over time (variable "date_adj"). After that I calculate the mean for the new _b values, so that I should have 4 different mean betas overall in the end.

It sounds to me like you have daily data on 49 industry returns for 700 month. So for each of the 49 industries, you estimate 700 separate regressions, one for each month. If that is indeed what you do, you will have 49x700 separate sets of coefficients. Within an industry, the coefficients [_b_IP, _b_MktRF, _b_HML and _b_SMB] will differ by month-- they are not "constant over time." They will also differ across industries, even for the same time (month.)

Am I misunderstanding?

Devra Golbe
Professor Emerita, Dept. of Economics
Hunter College, CUNY
Comment
Alexander Schmidt

Join Date: Jun 2018

Posts: 25
#7

24 Jun 2018, 10:46

Nick, Devra thank you both for your answers!

Nick, I think I didn't express myself right. At first I had multiple rows of values per industry. For each row I have to run a separate regression. After performing the mentioned command I didn't get regression betas for each different row per industry. I rather ended up with 1 row of beta values per industry (which is exactly what I need). So I am wondering if statsby calculated the betas for each of the different rows per industry and then aggregated the betas per industry.

Devra, you mainly understood it correct. I have monthly return data for 49 industry portfolios. My sample comprises 700 months overall. In line with my understanding of the FMB approach, I first regressed the excess return for each portfolio for each month against the 4 variables IP MktRF HML SMB:

Code:

statsby _b, by(Ind) clear: reg eret IP MktRF HML SMB

Actually, I should calculate 72-months rolling window betas. I am still figuring out how to integrate this in the command. However, then I have 49x b_IP, _b_MktRF _b_HML and _b_SMB. Now I want to regress the excess return of each portfolio against the beta values of the portfolios and aggregate the new betas. I want to do this for all 700 months. An example:
Automotive:
Regress eret against b_IP, _b_MktRF _b_HML and _b_SMB for January 1950
Regress eret against b_IP, _b_MktRF _b_HML and _b_SMB for February 1950
....
Software:
Regress eret against b_IP, _b_MktRF _b_HML and _b_SMB for January 1950
Regress eret against b_IP, _b_MktRF _b_HML and _b_SMB for February 1950
....
I tried it with this command:

Code:

asreg eret _b_IP _b_MktRF _b_SMB _b_HML, fmb

Since I don't get nearly the results as in the paper that I am trying to replicate, I think I made a mistake in the last step. Usually in the FMB approach's second step, there is only one excess return value per month. But since I have 49 portfolios, I have 49 values per month. I don't know how to conduct the second FMB step on this.
Comment
Alexander Schmidt

Join Date: Jun 2018

Posts: 25
#8

25 Jun 2018, 03:59

Just a small addition to your comment Devra:
Within an industry, the coefficients [_b_IP, _b_MktRF, _b_HML and _b_SMB] do not differ by month-- they are "constant over time." But they differ across industries, even for the same time (month.)

Any help from you would be really appreciated.
Comment

Announcement

Fama-MacBeth cross-sectional regression with 49 stock portfolios

Comment

Comment

Comment

Comment

Comment

Comment

Comment