How to double-sort quartile portfolio?

Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#31

09 Dec 2017, 14:15

Code:

gen H_L = .
replace H_L = 1 if P1==1 & double_P2 == 1
replace H_L = 2 if P1==1 & double_P2 == 4
qui ttest alpha, by(H_L) unequal
scalar t_D1 = r(t)
scalar p_D1 = r(p)

Yes, that should work, and it also allows you to deal with heteroscedasticity among the 16 groups (though you could accomplish that by specifying -vce(robust)- in the regression command in #25). The only drawback to your code is that it's long, and you will have to either repeat it 8 times or write a loop to do it. The two approaches will give similar, though not identical, results. Matter of taste, I think.
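The loop mentioned above might look something like the following sketch. It assumes, purely for illustration, that four of the eight comparisons contrast double_P2 quartile 1 with quartile 4 within each quartile of P1; the scalar names t_D`i' and p_D`i' simply extend the naming used in the code quoted above. The remaining comparisons would follow the same pattern with the conditions adjusted.

```stata
* Sketch only: assumes the comparisons of interest are double_P2 == 1 vs. 4
* within each quartile of P1 (adjust the -if- condition to match your design)
forvalues i = 1/4 {
    quietly ttest alpha if P1 == `i' & inlist(double_P2, 1, 4), ///
        by(double_P2) unequal
    scalar t_D`i' = r(t)    // t statistic for the comparison within P1 == `i'
    scalar p_D`i' = r(p)    // two-sided p-value for that comparison
}
scalar list
```

Restricting the sample with -if- and passing double_P2 directly to -by()- avoids having to rebuild a grouping variable such as H_L on every pass through the loop.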

For instance, to obtain the Bonferroni-adjusted p-value, one multiplies the uncorrected p-value by the total number of comparisons. If so, should the nominal p-values be corrected?

Yes.

Code:

margins P1#double_P2, pwcompare(effects) mcompare(bonferroni)

will not work for you because it will do the correction for the full 120 pairwise comparisons that -margins- generates. That would be an overcorrection, because you are only doing 8 comparisons.
Jae Li

Join Date: May 2017

Posts: 184
#32

10 Dec 2017, 06:33

@Clyde Schechter Thank you for confirming! I have learned a lot.

Comparing the results of both methods, I find they generate t-statistics of similar absolute value, but one is positive and the other is negative. Do you know why that happens?

Moreover, since you said there is no need to add the -mcompare(bonferroni)- option, do I just need to multiply each obtained p-value by 8 (the number of comparisons) to correct it?

Many thanks for your help indeed!

Last edited by Jae Li; 10 Dec 2017, 07:02.
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#33

10 Dec 2017, 09:53

Comparing the results of both methods, I find they generate t-statistics of similar absolute value, but one is positive and the other is negative. Do you know why that happens?

It's just the order of the categories being compared. In one case you're looking at a test of the difference A minus B and in the other you're looking at B minus A.

Moreover, since you said there is no need to add the -mcompare(bonferroni)- option, do I just need to multiply each obtained p-value by 8 (the number of comparisons) to correct it?

Yes. If the multiplied p-value exceeds 1, of course you just stay with 1. Alternatively, you can call your results statistically significant if the original p < 0.05/8 (=0.00625) instead of comparing to 0.05 (assuming you were using the 0.05 significance level).
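As a minimal sketch of that arithmetic, assuming the unadjusted p-value is still held in the scalar p_D1 from the -ttest- code quoted in #31 (the name p_D1_bonf is made up for illustration):

```stata
* Sketch only: p_D1 is the unadjusted p-value saved after -ttest-;
* 8 is the number of planned comparisons
scalar p_D1_bonf = min(1, 8 * p_D1)    // cap the adjusted p-value at 1
display "Bonferroni-adjusted p = " %6.4f p_D1_bonf

* Equivalent decision rule: compare the unadjusted p-value with 0.05/8
display "Below the adjusted threshold (1 = yes): " (p_D1 < 0.05/8)
```

Both forms lead to the same decision; the first is convenient when adjusted p-values are to be reported, the second when only a significant/not significant verdict is needed.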
Jae Li

Join Date: May 2017

Posts: 184
#34

12 Dec 2017, 05:21

@Clyde Schechter Thank you very much indeed, Clyde! All is clear and you are always a great helper! Merry Xmas!

Jae Li

Join Date: May 2017
Posts: 184

#35

12 Dec 2017, 16:15

@Clyde Schechter Hi Clyde, can I ask you a question related to the post #25?

Code:

capture program drop myregress
program define myregress
    regress excess_returns mktrf_m
    gen alpha = _b[_cons]
    gen se_cons = _se[_cons]
    exit
end
 
runby myregress, by(P1 double_P2)
table P1 double_P2, c(mean alpha mean se_cons)
regress alpha i.P1##i.double_P2
regress, coeflegend
margins P1#double_P2, pwcompare(effects)

When I run the above code, the regression output is almost empty, like this:

Code:

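. regress alpha i.P1##i.double_P2

      Source |       SS           df       MS      Number of obs   =   601,299
-------------+----------------------------------   F(15, 601283)   =         .
       Model |  8.02136077        15  .534757385   Prob > F        =         .
    Residual |           0   601,283           0   R-squared       =    1.0000
-------------+----------------------------------   Adj R-squared   =    1.0000
       Total |  8.02136077   601,298   .00001334   Root MSE        =         0

------------------------------------------------------------------------------
       alpha |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          P1 |
          2  |    .002405          .        .       .            .           .
          3  |   .0054403          .        .       .            .           .
          4  |   .0063755          .        .       .            .           .
             |
   double_P2 |
          2  |   .0028985          .        .       .            .           .
          3  |   .0073139          .        .       .            .           .
          4  |   .0101489          .        .       .            .           .
             |
P1#double_P2 |
        2 2  |   .0007555          .        .       .            .           .
        2 3  |   .0013669          .        .       .            .           .
        2 4  |  -.0011102          .        .       .            .           .
        3 2  |   .0023717          .        .       .            .           .
        3 3  |  -.0011097          .        .       .            .           .
        3 4  |  -.0021836          .        .       .            .           .
        4 2  |  -.0012774          .        .       .            .           .
        4 3  |  -.0048721          .        .       .            .           .
        4 4  |  -.0063355          .        .       .            .           .
             |
       _cons |  -.0069441          .        .       .            .           .
------------------------------------------------------------------------------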

If I add the -vce(robust)- option to -regress-, the result is strange as well:

Code:

. regress alpha i.P1##i.double_P2, vce(robust)

Linear regression                               Number of obs     =    601,299
                                                F(0, 601283)      =          .
                                                Prob > F          =          .
                                                R-squared         =     1.0000
                                                Root MSE          =          0

------------------------------------------------------------------------------
             |               Robust
       alpha |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          P1 |
          2  |    .002405   4.99e-16  4.8e+12   0.000      .002405     .002405
          3  |   .0054403   5.27e-16  1.0e+13   0.000     .0054403    .0054403
          4  |   .0063755   4.99e-16  1.3e+13   0.000     .0063755    .0063755
             |
   double_P2 |
          2  |   .0028985   4.99e-16  5.8e+12   0.000     .0028985    .0028985
          3  |   .0073139   5.00e-16  1.5e+13   0.000     .0073139    .0073139
          4  |   .0101489   5.70e-16  1.8e+13   0.000     .0101489    .0101489
             |
P1#double_P2 |
        2 2  |   .0007555   4.99e-16  1.5e+12   0.000     .0007555    .0007555
        2 3  |   .0013669   5.00e-16  2.7e+12   0.000     .0013669    .0013669
        2 4  |  -.0011102   5.70e-16 -1.9e+12   0.000    -.0011102   -.0011102
        3 2  |   .0023717   5.27e-16  4.5e+12   0.000     .0023717    .0023717
        3 3  |  -.0011097   5.29e-16 -2.1e+12   0.000    -.0011097   -.0011097
        3 4  |  -.0021836   5.95e-16 -3.7e+12   0.000    -.0021836   -.0021836
        4 2  |  -.0012774   4.99e-16 -2.6e+12   0.000    -.0012774   -.0012774
        4 3  |  -.0048721   5.00e-16 -9.7e+12   0.000    -.0048721   -.0048721
        4 4  |  -.0063355   5.70e-16 -1.1e+13   0.000    -.0063355   -.0063355
             |
       _cons |  -.0069441   4.99e-16 -1.4e+13   0.000    -.0069441   -.0069441
------------------------------------------------------------------------------

A partial dataset is provided in post #24 for your review.
Do you know what is going wrong? Many thanks in advance for your help!

Last edited by Jae Li; 12 Dec 2017, 16:53.


Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#36

12 Dec 2017, 18:14

Take a look at the -regress- output in #35. The R² = 1, and the residual sum of squares is zero. So what this means is that you have an exact perfect linear relationship between your variable alpha and all of the combinations of P1 and double_P2 in your data. That's not surprising considering that the variable alpha was created using -runby, by(P1 double_P2)-. That means that -runby- creates a single value of alpha for every combination of P1 and double_P2, and that value of alpha appears in all observations with those values of P1 and double_P2. So even though you have many observations per P1 # double_P2 combination, the value of alpha is the same in all of those. So the regression is a perfect fit and there are no standard errors.
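One way to confirm this in the data is to summarize alpha within each P1 # double_P2 combination. The following is a sketch under the assumption that alpha was created by the -runby- call in #35; the variable name cell is made up for illustration.

```stata
* Sketch only: -cell- is a hypothetical name for an identifier of the
* 16 P1 # double_P2 combinations
egen cell = group(P1 double_P2)

* If alpha is constant within each combination, min equals max and sd is 0
tabstat alpha, by(cell) statistics(n min max sd)
```

A standard deviation of zero in every row would indicate that alpha carries no within-combination variation, which is what produces the zero residual sum of squares.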
Jae Li

Join Date: May 2017

Posts: 184
#37

13 Dec 2017, 04:08

@Clyde Schechter Thank you for your reply!

That means that -runby- creates a single value of alpha for every combination of P1 and double_P2, and that value of alpha appears in all observations with those values of P1 and double_P2. So even though you have many observations per P1 # double_P2 combination, the value of alpha is the same in all of those. So the regression is a perfect fit and there are no standard errors.

May I ask whether it is normal to get a result like this?

As it's a perfect linear relationship with no standard errors, there will be no t-statistics or p-values, right? But if I add the -vce(robust)- option to -regress-, I get really small standard errors and extremely large t-statistics. Can I use them as evidence of statistical significance? Many thanks for your help!

Last edited by Jae Li; 13 Dec 2017, 04:52.
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#38

13 Dec 2017, 08:18

May I ask whether it is normal to get a result like this?

Well, it is normal, and inevitable, in a situation where you have defined the variable alpha in terms of P1 and double_P2 only, and then regress it on all combinations of P1 and double_P2. Given the way you have constructed alpha, there is no variation within any combination of P1 and double_P2, so the regression could not possibly give any other result.

As it's a perfect positive linear relationship and no standard errors, there will be no t-test statistics and p-values, right?

Correct.

But if I add the -vce(robust)- option to -regress-, I get really small standard errors and extremely large t-statistics. Can I use them as evidence of statistical significance?

No. What you are seeing in those standard errors are rounding errors. These "robust" standard errors are not standard errors at all. There is no residual variation in the model; the very concept of t-tests and p-values is inapplicable here.

I'm not involved in finance at all, so I don't know what the overall thrust of your project is here. (And it's pointless to try to explain it to me because I really just don't get these things.) But I have the general sense that you are trying to do a kind of study whereby you partition a bunch of assets into portfolios based on some characteristics of the assets, contrast the performance of those portfolios, and then make inferences about whether those characteristics are good predictors of subsequent performance. In your case, you have made a single estimate for each portfolio as a whole, which is why you are facing an invariant alpha, perfect prediction, and no possibility of statistical tests. Perhaps what you meant to do was characterize the performance of each asset within each portfolio, so that you would have a separate alpha for each asset and there would be variation within portfolios? That would require a slightly different approach back at the stage where you calculate alpha: the -by()- option of -runby- would need to include the variable(s) that identify the individual assets within the portfolios.
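A rough sketch of that change, assuming the data contain an asset identifier (called asset_id here purely for illustration; substitute whatever variable actually identifies the assets) and reusing the myregress program from #35:

```stata
* Sketch only: asset_id and one_per_asset are hypothetical names
capture drop alpha se_cons
runby myregress, by(P1 double_P2 asset_id)

* Keep one alpha per asset within each portfolio, so that repeated
* observations on the same asset are not counted as separate estimates
egen byte one_per_asset = tag(P1 double_P2 asset_id)
regress alpha i.P1##i.double_P2 if one_per_asset, vce(robust)
```

With one alpha per asset, there can be variation within each P1 # double_P2 combination, so the regression is no longer forced to be a perfect fit. Whether asset-level alphas answer the research question is a substantive matter that the sketch does not settle.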