  • #16
    Clyde and Nick,

    Here is a (slightly simplified) note on the methodology, which is a standard technique in empirical finance.

    People often sort common stocks into, say, quintiles based on a characteristic like the book-to-market ratio or firm size. They then compute the monthly return (i.e. relative change in value) for each of the five portfolios (or "groups of stocks") and compare the (risk-adjusted) mean return of the top-quintile portfolio with that of the bottom-quintile portfolio. The goal of this procedure is to discover a firm characteristic which is able to choose stocks which have, on average, better returns than the overall stock market.
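
    A minimal sketch of this kind of sort, assuming toy data and hypothetical variable names (stock, month, bm, ret) that are not from this thread. It uses raw rather than risk-adjusted returns, and it tests the time series of monthly top-minus-bottom spreads, which is only one of several ways to make the comparison:

    ```stata
    clear
    set seed 2017
    set obs 500
    gen stock = _n
    expand 12
    bysort stock: gen month = ym(2016, 12) + _n    // 2017m1 to 2017m12
    format month %tm
    gen bm  = exp(rnormal())          // hypothetical characteristic (e.g. book-to-market)
    gen ret = rnormal(0.01, 0.05)     // hypothetical monthly return

    // Sort stocks into quintiles of bm within each month (ties split arbitrarily)
    bysort month (bm): gen byte quintile = ceil(5 * _n / _N)

    // Equal-weighted mean return of each quintile portfolio in each month
    collapse (mean) ret, by(month quintile)
    reshape wide ret, i(month) j(quintile)

    // Top-minus-bottom spread per month, tested against zero
    gen spread = ret5 - ret1
    ttest spread == 0
    ```

    Because the toy returns are generated independently of bm, the spread here should hover around zero; with real data the characteristic would replace bm.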

    Best,
    Daniel


    • #17
      Hi Nick, Clyde

      I understand your viewpoint and concerns.

      Testing the statistical significance of q5-q1 factor returns is a standard procedure in the finance literature (e.g., studies on factor models such as Fama-French, Carhart, Sharpe, etc.).

      Best,
      John.


      • #18
        Daniel,

        This sounds different from what John Abe was asking for. I understood him to be looking at testing the means of the top and bottom quintiles of the returns distribution for a significant difference, which is not a meaningful use of statistical testing. You are describing testing the mean returns in the top and bottom quintiles of the distribution of some other variable. That would be meaningful. Perhaps John will clarify what he actually means.

        Added: Crossed with #17.
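
        The distinction can be illustrated with a small simulation. This is only a sketch on made-up data; y, x, q_y and q_x are hypothetical names. Quintiles of y itself differ in mean y by construction, whereas quintiles of an unrelated variable x carry no built-in difference:

        ```stata
        clear
        set seed 1
        set obs 1000
        gen y = rnormal()
        gen x = rnormal()              // generated independently of y

        // Quintiles of y itself: the top and bottom groups differ by construction
        xtile q_y = y, nq(5)
        ttest y if inlist(q_y, 1, 5), by(q_y)
        scalar p_same = r(p)

        // Quintiles of another variable: the test now asks a real question
        xtile q_x = x, nq(5)
        ttest y if inlist(q_x, 1, 5), by(q_x)
        scalar p_other = r(p)

        display "p (quintiles of y itself)   = " p_same
        display "p (quintiles of unrelated x) = " p_other
        ```

        The first p-value is essentially zero whatever the data are, so it says nothing about y; only the second comparison can come out either way.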


        • #19
          Thanks Daniel for your comments.

          I meant computing the "risk-adjusted" factor return (after accounting for other firm characteristics) and then testing whether the residual Q1 and Q5 quintile returns are significantly different. If the difference is not statistically significant, then the factor is not useful in identifying securities that, on average, have a better return profile.

          Best,
          John.


          • #20
            But John, you have not addressed the question I posed in #18. What are Q1 and Q5 the quintiles of? Are they quintiles of the risk adjusted factor return itself, or of some other variable?


            • #21
              Hi Clyde,

              The Q1 and Q5 returns are of the residual after taking out the effects of the market and other firm characteristics. In this case the distribution of the original factor and the residual are different, so it is technically some other variable.

              Best,
              John.


              • #22
                OK, I don't understand the financial jargon here, and so I can't figure out which variables in your example data are which. So I'll put it abstractly and then you can plug in the actual variable names. Let's call the residual after taking out the effects... variable residual. And let's call the other thing factor. It's not just "technically" another variable--it has to actually be another variable for this to give you something other than garbage. And let's call your date variable date. Substitute the actual variable names in this code.

                Code:
                egen quintile = xtile(factor), by(date) nq(5)
                levelsof date, local(dates)
                foreach d of local dates {
                    display "Date: " %td `d'
                    ttest residual if inlist(quintile, 1, 5) & date == `d', by(quintile)
                }
                Notes:
                1. Requires the -egenmore- package so that you can use the -xtile()- -egen- function.
                2. I assume you want to test equality of the mean of residual in the 1st and 5th quintiles of factor, separately at each date. That is what this code does.
                3. The mean residuals in the 1st and 5th quintiles will be part of the output of the -ttest- command. They will not be saved in the data set.
                4. If you need those means saved in the data, I would use a slightly different approach; post back if that's what you want.


                • #23
                  Hi Clyde,

                  Thank you. I think it would be important to save the 1st and 5th quintile residual means if possible.

                  Best,
                  John.


                  • #24
                    So the following code creates a toy data set with the relevant variables (date, residual, factor) and then does the calculations. At the end the dataset will contain the following additional variables:

                    quintile -- the quintile of factor into which this asset falls on this date

                    quintile_mean_residual -- the mean value of residual in this asset's quintile of factor from this date. You may not care about its values for quintiles other than 1 and 5. Feel free to disregard the results for quintiles 2-4 if you wish.

                    t_stat -- the t-statistic (in absolute value, since it is computed as sqrt(F)) for a t-test of equality of quintile_mean_residual in quintiles 1 and 5 for this date

                    p_value -- the p-value of that t-test

                    Code:
                    clear*
                    
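                    capture program drop myprogram
                    program define myprogram
                        // runby calls this once per date, with only that date's observations in memory
                        regress residual i.quintile
                        // fitted values from this regression are the quintile means of residual
                        predict quintile_mean_residual
                        // test equality of the quintile 5 and quintile 1 means
                        test 5.quintile = 1.quintile
                        // with 1 numerator df, sqrt(F) is the absolute value of the t-statistic
                        gen t_stat = sqrt(r(F))
                        gen p_value = r(p)
                        exit
                    end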
                    
                        
                    
                    //    CREATE DEMONSTRATION DATA SET
                    set seed 1234
                    set obs 100
                    gen asset = _n
                    expand 10
                    by asset, sort: gen date = mdy(1, 1, 2000+_n-1)
                    format date %td
                    gen residual = rnormal()
                    gen factor = rgamma(10, 4)
                    
                    //    SEPARATE ASSETS INTO QUINTILES BY LEVELS OF FACTOR AT EACH DATE
                    egen byte quintile = xtile(factor), by(date) nq(5)
                    
                    //    CALCULATE MEAN RESIDUALS IN EACH QUINTILE AND
                    //    TEST EQUALITY OF 1ST AND 5TH QUINTILES AT EACH DATE
                    runby myprogram, by(date)
                    Notes:
                    1. You need to get the -runby- program for this. It was written by Robert Picard and me, and it is available at SSC.
                    2. Evidently, you will want to run this with your actual data. So you will need to skip the part where a demonstration data set is created, and you will need to change the variable names in the code to correspond to the actual variable names for date, residual, and factor in your data everywhere they appear above.
                    3. As there may be complications in your real data not foreseen as I wrote this code, I suggest that you first test the code on a small sample of your data. When testing it, add the -verbose- option to the -runby- command so that if errors crop up, you will be able to see what's happening and not just be left mystified by the absence of results. If there are no problems encountered, or after you fix those you do find, then run it using the entire data set and, if your data set is large, remove the -verbose- option (unless you want to see the gory details for every single date.)
                    Last edited by Clyde Schechter; 16 Oct 2017, 16:43.


                    • #25
                      Originally posted by Nick Cox
                      That makes sense. Each of the variables will have missing values for the other quintile bins, unless you explicitly spread the means.

                      Code:
                      xtile q_R3 = R3, nq(5)
                      bysort newdate : egen R3_q5 = mean(cond(q_R3 == 5, ex_ret, .) )
                      by newdate : egen R3_q1 = mean(cond(q_R3 == 1, ex_ret, .))
                      See e.g. Section 9 of http://www.stata-journal.com/sjpdf.h...iclenum=dm0055

                      (Please use CODE delimiters for code, as requested)
                      Hello Nick,

                      Thank you for the code, it's extremely helpful since I was also struggling with the missing values.

                      I was wondering how to compute a value-weighted version of the excess return, ex_ret, in the code. Thank you in advance for your help!

                      Best,
                      Kate


                      • #26
                        #25 Sorry, but I don't understand your question. ex_ret is in the code you cite.


                        • #27
                          I apologise for that. I meant, how would the code change if I want to generate the value-weighted excess returns for the particular quintile portfolio. I want to value-weight with MV, or market value.
                          Thank you!


                          • #28
                            Originally posted by Nick Cox
                            #25 Sorry, but I don't understand your question. ex_ret is in the code you cite.
                            Hi Nick,

                            Maybe as a follow-up: instead of trying to use the value-weighted function in egen, I tried to use asgen. However, in asgen, I am not sure how to account for the missing values as you did with the cond function.

                            I thought the code could be:

                            bys ymdate : asgen ExUSD = F_Excess_USD_w(cond(IVOL_w_5 == 1, F_Excess_USD_w, .)), w(MV_USD_w)

                            But, then I get an error message: unknown function F_Excess_USD_w(). I replaced mean, as in your example, with the variable of interest because with asgen, I did not need the mean function. However, I don't know what to put in its place.

                            Thank you in advance. I very much appreciate your help.


                            • #29
                              Sorry, but I didn’t write asgen and have never used it. But my wild guess is that your syntax is a long way from what it supports. Attaullah Shah will no doubt address this.


                              • #30
                                Originally posted by Nick Cox
                                Sorry, but I didn’t write asgen and have never used it. But my wild guess is that your syntax is a long way from what it supports. Attaullah Shah will no doubt address this.
                                I completely understand that. In that case, would you have a recommendation as to how to account for value weighting in your code:
                                bysort newdate : egen R3_q5 = mean(cond(q_R3 == 5, ex_ret, .) )
                                by newdate : egen R3_q1 = mean(cond(q_R3 == 1, ex_ret, .))

                                Thank you kindly. I very much appreciate it.
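
                                One possible way to value-weight, offered as a sketch rather than as an answer from anyone in this thread: a value-weighted mean is sum(MV * ex_ret) / sum(MV) over the stocks in the quintile on each date, and both sums can be built with the -total()- -egen- function. The toy data and the variable name MV are assumptions; q_R3 is formed over all dates pooled, as in the quoted code.

                                ```stata
                                clear
                                set seed 42
                                set obs 200
                                gen id = _n
                                expand 6
                                bysort id: gen newdate = ym(2017, 1) + _n - 1
                                format newdate %tm
                                gen R3     = rnormal()              // hypothetical sorting variable
                                gen ex_ret = rnormal(0.005, 0.04)   // hypothetical excess return
                                gen MV     = exp(rnormal(6, 1))     // hypothetical market value (the weight)

                                xtile q_R3 = R3, nq(5)              // pooled over dates, as in the quoted code

                                foreach q in 1 5 {
                                    // use only observations in this quintile with both return and weight present
                                    local ok q_R3 == `q' & !missing(ex_ret, MV)
                                    egen double num = total(cond(`ok', MV * ex_ret, .)), by(newdate)
                                    egen double den = total(cond(`ok', MV, .)), by(newdate)
                                    // value-weighted mean, spread to every observation on that date
                                    gen double R3_q`q'_vw = num / den
                                    drop num den
                                }
                                ```

                                If a date has no usable observations in a quintile, both totals are zero and the result is missing. For any single date and quintile, the value can be checked against -summarize ex_ret [aweight = MV]- restricted to that date and quintile.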
