I want to follow Daniel et al. (1997) and do a 5*5*5 triple sort (dependent sorting) on the CRSP common stock universe, placing stocks into a three-dimensional characteristics space. The 5*5*5 sorting procedure classifies every stock into quintiles according to three characteristics: market capitalization (ME), book-to-market ratio (BM), and past stock return momentum (Mom).
yrm: date variable, e.g. 1998m1,1998m2...
exchcd: exchange code, = 1 if the stock is listed on NYSE (New York Stock Exchange)
Step 1: Sort all stocks (NYSE-listed and non-NYSE-listed) into size quintiles using the NYSE market equity breakpoints.
My code is as follows:
forvalues i = 20(20)80{
qui bysort yrm: egen ME_Brkp`i' = pctile(ME) if exchcd == 1, p(`i')
}
// At this point the non-NYSE stocks have missing values for all ME breakpoints,
// so the NYSE-based breakpoints need to be assigned to them too
forvalues i = 20(20)80{
qui by yrm (permno), sort: replace ME_Brkp`i' = ME_Brkp`i'[_n-1] if ME_Brkp`i' >= .
}
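A possible simplification of the two loops above, offered as a sketch rather than a tested replacement: pass cond(exchcd == 1, ME, .) to pctile() so that only NYSE stocks enter the percentile calculation while the result is stored for every stock in the month. This avoids the fill-forward step, which leaves the breakpoint missing for any non-NYSE observation that sorts ahead of the first NYSE stock in its month. The toy data below are made up purely for illustration; the variable names follow the code above.

```stata
* Toy data (hypothetical): 2 months x 200 stocks, roughly half on NYSE
clear
set seed 12345
set obs 400
gen yrm    = cond(_n <= 200, 1, 2)
gen permno = _n
gen exchcd = cond(runiform() < 0.5, 1, 3)
gen ME     = exp(rnormal())

* NYSE-only breakpoints, stored on every observation in the month
forvalues i = 20(20)80 {
    bysort yrm: egen ME_Brkp`i' = pctile(cond(exchcd == 1, ME, .)), p(`i')
}

* No breakpoint should be missing, whether or not the stock is on NYSE
assert !missing(ME_Brkp20, ME_Brkp40, ME_Brkp60, ME_Brkp80)
```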
// An "if" command tests only the first observation, so missing breakpoints
// are handled with an if qualifier in the last replace instead
qui {
g ME_rank = 1 if ME <= ME_Brkp20 // Smallest stocks
replace ME_rank = 2 if ME > ME_Brkp20 & ME <= ME_Brkp40
replace ME_rank = 3 if ME > ME_Brkp40 & ME <= ME_Brkp60
replace ME_rank = 4 if ME > ME_Brkp60 & ME <= ME_Brkp80
replace ME_rank = 5 if ME > ME_Brkp80 & ME != . // Biggest stocks
replace ME_rank = . if missing(ME_Brkp20, ME_Brkp40, ME_Brkp60, ME_Brkp80)
}
Step 2: Each size quintile is further subdivided into book-to-market quintiles.
forval r = 1(1)5{
forvalues i = 20(20)80 {
qui bysort yrm ME_rank: egen BM_p`i'_`r' = pctile(cond(ME_rank==`r',BM,.)), p(`i')
}
}
g BM_rank = .
forval r=1(1)5{
qui{
replace BM_rank = 1 if BM<=BM_p20_`r' & ME_rank== `r'
replace BM_rank = 2 if BM>BM_p20_`r' & BM<=BM_p40_`r' & ME_rank== `r'
replace BM_rank = 3 if BM>BM_p40_`r' & BM<=BM_p60_`r' & ME_rank== `r'
replace BM_rank = 4 if BM>BM_p60_`r' & BM<=BM_p80_`r' & ME_rank== `r'
replace BM_rank = 5 if BM>BM_p80_`r' & ME_rank== `r' & BM!=.
}
}
Step 3: Each of the resulting 25 portfolios is further subdivided into quintiles based on the stocks' past 12-month returns.
forval mer = 1(1)5{
forval bmr = 1(1)5{
forvalues i = 20(20)80 {
qui bysort yrm ME_rank BM_rank: egen Mom_p`i'_`mer'_`bmr' = pctile(cond(ME_rank==`mer',BM_rank==`bmr',Mom,.)), p(`i')
}
}
}
g Mom_rank =.
forval mer=1(1)5{
forval bmr=1(1)5{
qui{
replace Mom_rank = 1 if Mom<=Mom_p20_`mer'_`bmr' & ME_rank== `mer' & BM_rank== `bmr'
replace Mom_rank = 2 if Mom>Mom_p20_`mer'_`bmr' & Mom<=Mom_p40_`mer'_`bmr' & ME_rank== `mer' & BM_rank== `bmr'
replace Mom_rank = 3 if Mom>Mom_p40_`mer'_`bmr' & Mom<=Mom_p60_`mer'_`bmr' & ME_rank== `mer' & BM_rank== `bmr'
replace Mom_rank = 4 if Mom>Mom_p60_`mer'_`bmr' & Mom<=Mom_p80_`mer'_`bmr' & ME_rank== `mer' & BM_rank== `bmr'
replace Mom_rank = 5 if Mom>Mom_p80_`mer'_`bmr' & ME_rank== `mer' & BM_rank== `bmr' & Mom!=.
}
}
}
Can you spot anything wrong with the code above?
I ask because when I tabulate the resulting variables:
tab ME_rank
tab BM_rank
tab Mom_rank
I get confusing and suspicious results for the rank based on Mom (momentum, the past 12-month return):
tab Mom_rank
   Mom_rank |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |  1,912,688       93.12       93.12
          5 |    141,391        6.88      100.00
------------+-----------------------------------
      Total |  2,054,079      100.00
Some observations should have Mom_rank values of 2, 3, or 4. Why does Mom_rank take only two values, 1 and 5, and why do over 90% of the observations have the value 1?
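A likely explanation, offered as a sketch that has not been run on the CRSP data: cond(x, a, b, c) returns a if x is true, b if x is false, and c if x is missing. So cond(ME_rank==`mer', BM_rank==`bmr', Mom, .) in Step 3 returns the 0/1 value of BM_rank==`bmr' whenever ME_rank==`mer', not Mom. Within the cell that the ranking step later uses, that value is 1 for every stock, so all four breakpoints equal 1, ranks 2 to 4 can never be assigned, and Mom_rank ends up as 1 when Mom <= 1 and 5 otherwise. Joining the two conditions with & should give the intended behavior. The toy data and the names bad_p20 and good_p20 below are invented for illustration only.

```stata
* Toy data (hypothetical): one month, 20 stocks in each of the 25 cells
clear
set seed 2024
set obs 500
gen yrm     = 1
gen ME_rank = 1 + mod(_n, 5)
gen BM_rank = 1 + mod(floor(_n / 5), 5)
gen Mom     = rnormal(0.1, 0.4)

* As posted: the second argument is a 0/1 indicator, not Mom
bysort yrm ME_rank BM_rank: egen bad_p20 = ///
    pctile(cond(ME_rank==1, BM_rank==1, Mom, .)), p(20)

* Intended: Mom for stocks in the (1,1) cell, missing everywhere else
bysort yrm ME_rank BM_rank: egen good_p20 = ///
    pctile(cond(ME_rank==1 & BM_rank==1, Mom, .)), p(20)

* Compare the two breakpoints inside the (1,1) cell
summarize bad_p20 good_p20 if ME_rank==1 & BM_rank==1
```

In the loops of Step 3 this would mean writing pctile(cond(ME_rank==`mer' & BM_rank==`bmr', Mom, .)) in place of the four-argument call.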