Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Sorting and averaging portfolios, time-series regressing each decile's return on factors

    Dear Stata Experts,

    Could you please give me hints on how can I implement the following:

    1. I have my data in the following form:

    Code:
    stock    ymdate    excess    mkt    smb    hml    rmw    cma    nobs    r2    r2adj    beta_mkt    beta_smb    beta_hml    beta_rmw    beta_cma    rs_cons
    111111    330        .0385    -.0108    .0071    -.0039    .0163                   .                 
    111111    331    .0287317    .0352    -.0091    -.0093    .0196    -.0163     .                              
    111111    332    -.0397046    -.0259    .0038    .0028    -.0097    .0189    1                                
    111111    333    -.1983077    -.2324    -.0813    .0423    .0171    .0222    2                                
    111111    334    -.0792021    -.0777    .0285    .0308    -.0206    .007      3                                
    111111    335    .0706091    .0681    .0007    -.0445    .0303    -.025       4                                
    ....                     
    111111    364    .0827522    .0842    -.0238    -.0373    .0178    -.0149    33                                
    111111    365    -.0080123    -.0109    .0137    -.0195    -.0123    -.0036    34    .9830043    .97996935    .89905739    -.3561997    -.17716458    -.2862515    -.08569398    -.00008467
    111111    366    -.0030894    -.019    -.0329    -.0002    -.0044    .0309    35    .98301784    .98008988    .89945595    -.3561501    -.1752691    -.28215496    -.08410732    -.00011995
    ....
    wficn ymdate excess mkt smb hml rmw cma nobs r2 r2adj beta_mkt beta_smb beta_hml beta_rmw beta_cma rs_cons
    111112 401 .0031 .001 .0261 -.0093 .0118
    111112 402 -.0134 -.0034 .009 .0325 -.0216 .0202
    ...
    111112 436 .0332269 .0236 .0313 -.0121 .0026 -.0021 34 .78003481 .74075532 1.2317218 .59357388 -.82897375 -.10990499 -.4979433 .00160522
    111112 437 -.0671342 -.0114 -.0347 .0155 .0342 .0109 35 .77233286 .73307991 1.1876516 .51866611 -.81139137 -.20526249 -.58642725 .00172887

    2. On ymdate corresponding to January of each year (1990-2012) I would like to sort stocks based on their rs_cons value on that date (values of which begin from 34th months for each stock) into 10 equal deciles, from highest rs_cons to the lowest. As I see it in Stata: count total number of stocks, for which value rs_cons exists on required ymdate, divide by 10, round up - this is the number of stocks in each decile for each yearly ymdate. Then sort, allocate stocks among deciles.

    3. Calculate equal-weighted average of "excess" among stocks in each decile for each yearly ymdate. Save averages of each deciles as time-series on yearly frequency.
    4. Regress yearly time-series averaged returns of each decile on the two factors (also time-series, with the same frequency).

    Given that Stata can work only with one matrix at a time, I'm a bit lost of how can I sort, average returns, store them and regress on factors for several periods.
    I would appreciate your help very much.

    Best,
    Ekaterina

  • #2
    2. On ymdate corresponding to January of each year (1990-2012) I would like to sort stocks based on their rs_cons value on that date (values of which begin from 34th months for each stock) into 10 equal deciles, from highest rs_cons to the lowest. As I see it in Stata: count total number of stocks, for which value rs_cons exists on required ymdate, divide by 10, round up - this is the number of stocks in each decile for each yearly ymdate. Then sort, allocate stocks among deciles.
    Your query is not clear to me. What I understand is that you want to sort all stocks in each month on the basis of rs_cons variable into 10 groups. If this is true, then you can use astile program that can be downloaded from SSC.
    Code:
    ssc install astile
    bys ymdate: RS_Deciles = rs_cons, nq(10)
    The new variable RS_Deciles will assume values from 1 to 10; 1 for stocks with the smallest rs_cons and 10 for stocks with the highest.
    Regards
    --------------------------------------------------
    Attaullah Shah, PhD.
    Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    FinTechProfessor.com
    https://asdocx.com
    Check out my asdoc program, which sends outputs to MS Word.
    For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

    Comment


    • #3
      Yes, you're right about what I would like to achieve in 2).
      Could you please say how can I code so that Stata identifies the lowest ymdate in each year for each stock when doing this sorting?
      If I have 22 years, would I have 22 columns of RS_Deciles then?

      Thanks

      Comment


      • #4
        Could anyone help?

        Comment


        • #5
          Hopefully this FAQ will be helpful to you.
          Best regards,

          Marcos

          Comment


          • #6
            Dear all, dear Attaullah,

            I have an error while using your advice, please see below:

            Code:
            forval i=275(12)611 {
                qui gen cons_`i'=1 if ymdate==`i'
            }
            gen valid=max(cons_275,cons_287, cons_299, cons_311, cons_323, cons_335, cons_347, cons_359, cons_371, cons_383, cons_395, cons_407, cons_419, cons_431, cons_443, cons_455, cons_467, cons_479, cons_491, cons_503, cons_515, cons_527, cons_539, cons_551) 
            replace valid=. if rs_cons==.
            keep if valid!=.
            
            replace valid=rs_cons
            ssc install astile
            bys ymdate: RS_Deciles = valid, nq(10)
            }
            I receive: unrecognized command: RS_Deciles
            I first leave only rs_cons for ymdates I need and then want to sort them by ymdate. What am I doing wrong here?
            Thanks a lot.

            Comment


            • #7
              There is an error in the syntax. Change this line
              Code:
               
               bys ymdate: RS_Deciles = valid, nq(10)
              to this line
              Code:
                
               bys ymdate:astile RS_Deciles = valid, nq(10)
              Regards
              --------------------------------------------------
              Attaullah Shah, PhD.
              Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
              FinTechProfessor.com
              https://asdocx.com
              Check out my asdoc program, which sends outputs to MS Word.
              For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

              Comment


              • #8
                Thanks, it worked.
                Do you know how can I find the row number of particular variable? For example, which command would help me to show row number, in which ymdate equals to 372?
                Thanks!

                Comment


                • #9
                  If you don't have a sort "id" variable, you may type this:

                  Code:
                  . gen id = _n
                  Being sorted, it will provide the same number of the row.

                  Hope that helps.
                  Best regards,

                  Marcos

                  Comment


                  • #10
                    Code:
                    l if ymdate == 312
                    l ymdate if ymdate == 312
                    will show row (in Stata we always say observation) numbers. Note that a given value could occur in one or more observations.

                    Comment


                    • #11
                      Dear all,

                      I did not think that a given value could indeed occur in more than one observations (for different stocks). Can I somehow add a condition to find ym value of 312 only among the stocks corresponding to ym value?

                      For example:
                      stock ymdate excess mkt smb hml mom RS_Deciles ym_new
                      111111 632 .021669 .0273 .0048 .0157 -.0103 1 633
                      111111 342 -.0130782 -.0125 -.0021 .0227 .0063 1 343
                      111111 596 .0342064 .0408 .0249 .0097 -.05 2 597
                      111112 527 .0410009 .0429 -.0305 .0235 -.0569 5 528

                      I would like a new column (variable) that shows corresponding "excess" value for indicated ym_new and stock - for 1st row - find a value for "excess", for which ymdate=ym_new and stock=11111. Could you please help me to achieve it?

                      Thanks a lot.

                      Comment


                      • #12

                        Can I somehow add a condition to find
                        You may use - list - plus the "if" clause.
                        Best regards,

                        Marcos

                        Comment


                        • #13
                          Dear Marcos,
                          could you elaborate more on that please? How can I indicate "among the same stock only" in list if?

                          Comment


                          • #14
                            Ekaterina: Your questions aren't very clear. Part of this is presumably because your first language is not English but part is because you are making limited use of CODE delimiters or other ways of showing data or code.

                            I can only guess that you are asking for

                            Code:
                            list excess if ymdate==ym_new & stock==11111
                            which is an extension of principles already shown in #10 and is an application of if, which is precisely what Marcos is advising in #12.

                            Comment


                            • #15
                              Ok, I try to rephrase

                              I have the following dataset:
                              Code:
                              stock ymdate excess mkt smb hml mom RS_Deciles ym_new
                              111111 632 .021669 .0273 .0048 .0157 -.0103 1 633
                              111111 342 -.0130782 -.0125 -.0021 .0227 .0063 1 343
                              111111 596 .0342064 .0408 .0249 .0097 -.05 2 597
                              111112 527 .0410009 .0429 -.0305 .0235 -.0569 5 528
                              I would like to create a new variable (a new column) called "returns". It would equal to the value from "excess" column, for which ymdate=ym_new, but searching for it only among the stocks indicated in column "stock". For example, row 1 of a new variable would look through all excess values of stock 111111 and find "excess" value, for which ymdate=633. Is that possible?

                              Comment

                              Working...
                              X