Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • ASGEN : A Stata module for weighted average mean

    Thanks to Kit Baum, I have shared a new program asgen on SSC that computes weighted average mean. The program can be installed by
    Code:
    ssc install asgen
    Following are related details:


    Syntax

    [bysort varlist]: asgen newvar = exp [if] [in] [, weights(varname) by(varlist)]


    Description

    asgen creates a new variable from an existing variable or an expression. The new variable contains weighted average mean. The existing variable is supplied to asgen by the =exp. The weights are based on the values of an existing numeric variable, that is specified in the weights(varname) option.


    asgen accepts both the [if] [in] qualifiers for performing the required calculations on a subset of data. Further, the use of =exp can come handy in many situations where we want to make changes on the fly before we find the weighted average mean. For example, we might want to multiply the value of 100 to a variable before we find its weighted average. In such a case, the expression will look like:

    Code:
     asgen wmX = X * 100, w(Y)

    where wmX is a new variable to be created and shall contain the weighted average; X and Y are existing variables in the data set. Options

    weight(varname) is an optional option. Therefore, without this option, asgen works like egen command and finds simple mean


    Example 1: Weighted average mean for kstock using the variable mvalue as a weight

    Code:
    webuse grunfeld
     asgen WM_kstock = kstock, w(mvalue)

    Example 2: Weighted average mean using an expression

    .
    Code:
    asgen WM_kstock = kstock / invest, w(mvalue)
    This will divide the variable kstock on the variable invest before finding the weighted average mean.

    Example 3: Avoiding the use of [if] using an expression

    We use the [if] qualifier to perform calculation on a subset of the data. Using [if] condition, the resulting variable will have missing values where the condition is not true. There might be circumstance where we want to use [if] without encountering missing values in the new variable. With asgen, the use of an expression can come very handy in such a s situation. To proceed with an example, let us use astile [can be downloaded from ssc by: ssc install astile] to make five groups of firms based on the variable mvalue and five groups based on kstock.

    Code:
    astile size5 = mvalue, nq(5)
    astile kstock5 = kstock, nq(5)


    Now, let's find weighted average mean in each year for a combination of firms that are in the first quantile of mvalue (size == 1) and third quantile of kstock (kstock5 == 3) using mvalue as a weighting variable


    Code:
    bys year: asgen WM_invest = (kstock / (size5 ==1 & kstock5==3)), w(mvalue)
    Off-course, we could have done that using [if] qualifier, but that will generate missing values where the condition is not true

    Code:
    bys year: asgen WM_invest_IF = kstock if size5 ==1 & kstock5==3 , w(mvalue)
    sort id year
    The difference between WM_invest and WM_invest_IF is that the former spreads the results within years, and the later does not. The use of this trick is borrowed from the behavior of egen. Further details related to this trick can be read here in Nick Cox's column.
    *

    Comments for improvement of asgen are greatly appreciated.
    Last edited by Attaullah Shah; 10 Oct 2017, 22:06.
    Regards
    --------------------------------------------------
    Attaullah Shah, PhD.
    Associate Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    www.FinTechProfessor.com
    If you use MS Word, do check my asdoc program that easily sends Stata output to MS Word

  • #2
    Dear Attaullah,

    Asgen has been really great additional STATA module. I was wondering though if it is possible to create equal-weighted portfolios through the use of asgen? Thank you in advance.

    Best regards,
    Kate

    Comment


    • #3
      If you do not use the weight option, asgen then uses equal weights, just like egen function.
      Regards
      --------------------------------------------------
      Attaullah Shah, PhD.
      Associate Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
      www.FinTechProfessor.com
      If you use MS Word, do check my asdoc program that easily sends Stata output to MS Word

      Comment


      • #4
        I was wondering if asgen can be used to derive the long/short portfolio average excess returns, in which I would need to explicitly spread the means because there will be missing values for the other portfolio due to the portfolio sort.

        I gathered this code from other posts, as well as my own but I don't know how to actually make it work.
        bys ymdate : asgen ExUSD = F_Excess_USD_w(cond(IVOL_w_5 == 1, F_Excess_USD_w, .)), w(MV_USD_w)

        The F_Excess_USD_w before the parenthesis was mean, and I tried to replace it, but it doesnt work...

        Would really appreciate your feedback! Thank you!

        Comment


        • #5
          If you want to use if qualifier type statement and still do nto want have missing values where the condition is not met, then you can use this trick

          Code:
          bys ymdate : asgen ExUSD = (F_Excess_USD_w / (IVOL_w_5 == 1)), w(MV_USD_w)
          The help file of asgen provides the following material

          Example 3: Avoiding the use of [if] using an expression

          We use the [if] qualifier to perform calculations on a subset of the data. Using [if] condition, the resulting variable will have missing values where the condition is not true.
          There might be circumstances where we want to use [if] without encountering missing values in the new variable. The use of expression can come very handy in such as situation.
          To proceed with an example, let us use astile [can be downloaded from ssc by clicking here install astile] to make five groups of firms based on mvalue and five groups based
          kstock.

          Code:
          astile size5 = mvalue, nq(5)
              . astile kstock5 = kstock, nq(5)

          Now, let's find weighted average mean in each year for a combination of firms that are in the first quantile of mvalue (size == 1) and third quantile of kstock (kstock5 == 3)
          using mvalue as a weighting variable


          Code:
              . bys year: asgen WM_invest = (kstock / (size5 ==1 & kstock5==3)), w(mvalue)
          Off-course, we could have done that using [if] qualifier, but that will generate missing values where the condition is not true

          Code:
              . bys year: asgen WM_invest_IF = kstock if size5 ==1 & kstock5==3 , w(mvalue)
          The difference between WM_invest and WM_invest_IF is that the former spreads the results within years, and the later does not. The use of this trick is borrowed from the
          behavior of egen. Further details related to this trick can be read here https://ideas.repec.org/a/tsj/stataj...2p305-314.html in Nick Cox's column.
          *

          Regards
          --------------------------------------------------
          Attaullah Shah, PhD.
          Associate Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
          www.FinTechProfessor.com
          If you use MS Word, do check my asdoc program that easily sends Stata output to MS Word

          Comment


          • #6
            Would this work too?

            Code:
            bys year : asgen WM_Invest_IF =  cond(size5 == 1 & kstock5 == 3, kstock .),  w(mvalue) 
            This and the other trick are discussed in Sections 9 and 10 of the paer cited in #5.

            Comment


            • #7
              Thank you to you both! It worked perfectly. I very much appreciate your help

              Comment


              • #8
                https://journals.sagepub.com/doi/pdf...867X1101100210 takes you straight to a .pdf (public access) of the paper mentioned in #5.

                Comment


                • #9
                  Originally posted by Attaullah Shah View Post
                  If you do not use the weight option, asgen then uses equal weights, just like egen function.
                  I have tried this, but the result is actually the same in both cases. Would I have to add anything to the code to make it work for equally-weighted average excess returns?

                  bys ymdate : asgen ExUSD_1_VW_1_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==1)), w(MV_USD_w)
                  bys ymdate : asgen ExUSD_1_VW_5_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==5)), w(MV_USD_w)
                  bys ymdate : asgen EXUSD_1_VW_5_1_Size = ExUSD_1_VW_5 - ExUSD_1_VW_1

                  and

                  bys ymdate : asgen ExUSD_1_VW_1_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==1))
                  bys ymdate : asgen ExUSD_1_VW_5_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==5))
                  bys ymdate : asgen EXUSD_1_VW_5_1_Size = ExUSD_1_VW_5 - ExUSD_1_VW_1

                  Thank you kindly.

                  Comment


                  • #10
                    Originally posted by Kate Lussy View Post

                    I have tried this, but the result is actually the same in both cases. Would I have to add anything to the code to make it work for equally-weighted average excess returns?



                    bys ymdate : asgen ExUSD_1_VW_1_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==1)), w(MV_USD_w)
                    bys ymdate : asgen ExUSD_1_VW_5_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==5)), w(MV_USD_w)
                    bys ymdate : asgen EXUSD_1_VW_5_1_Size = ExUSD_1_VW_5 - ExUSD_1_VW_1

                    and

                    bys ymdate : asgen ExUSD_1_VW_1_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==1))
                    bys ymdate : asgen ExUSD_1_VW_5_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==5))
                    bys ymdate : asgen EXUSD_1_VW_5_1_Size = ExUSD_1_VW_5 - ExUSD_1_VW_1

                    Thank you kindly.
                    I want to add upon my post. This was also the case when I used _gwtmean.

                    My procedure is to:
                    1. Create portfolios
                    2. Limit each portfolio to one observation per month
                    3. To generate the value-weighted excess returns for each portfolio
                    4. To find t-statistics

                    I am wondering if it makes sense to move 2. to after 3.? If I do so, I then generate different values for the for equal and value weighting. I tried to look at the data to check if it is right, but I cannot make sense of it. So, it would be great to hear someone's feedback!


                    *Generate Future Excess Returns
                    tsset DSCode1 ymdate
                    gen F_Excess_USD_w=F.Excess_USD_w
                    drop if F_Excess_USD_w==.

                    *Univariate Portfolio Analysis
                    * Portfolio sorting according to IVOL
                    ssc install astile
                    astile IVOL_w_5 = IVOL_w, nq(5)

                    sort ymdate IVOL_w_5 DSCode1
                    by ymdate IVOL_w_5: drop if _n >1


                    * Value-weighted average excess returns of each portfolio in each month (ymdate)
                    bys ymdate: asgen ExUSD_1_VW = (F_Excess_USD_w / (IVOL_w_5 == 1)), w(MV_USD_w)

                    *Equal-weighted average excess returns of each portfolio in each month (ymdate)
                    bys ymdate: asgen ExUSD_1_EW = (F_Excess_USD_w / (IVOL_w_5 == 1))

                    sort ymdate IVOL_w_5 DSCode1
                    by ymdate IVOL_w_5: drop if _n >1


                    tsset DSCode1 ymdate
                    sort DSCode1 ymdate

                    *Value-weighted average excess returns & t-statistics of each portfolio
                    newey ExUSD_1_VW, lag(6) force
                    newey ExUSD_1_EW, lag(6) force

                    [CODE]
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input long DSCode1 float(ymdate MV_USD_w F_Excess_USD_w IVOL_w_5 ExUSD_1_EW ExUSD_1_VW)
                    1 559 151.45 .07484736 1 .020173784 .034087826
                    1 560 163.49 .0005981706 4 .006313993 .003041933
                    1 561 164.27 .012566957 4 .005149428 -.01948735
                    1 562 329.46 .019797264 3 .0251439 .0203678
                    1 563 337.62 .05277631 3 .03560889 .02851063
                    1 564 356.64 .07170374 3 -.005529334 -.03746695
                    1 565 383.81 .03556859 4 .016249549 .015531148
                    1 566 399.09 -.026278155 3 .033919852 .05264597
                    1 567 390.26 -.012818905 3 .011069474 -.005305402
                    1 568 380.41 -.012739185 3 -.009749626 -.0354712
                    1 569 377.02 -.04112721 3 -.00739424 -.04183678
                    1 570 363.43 -.04535 3 -.018767476 -.009455224
                    1 571 348.14 .02038562 3 -.012077605 -.005213041
                    1 572 356.64 -.07103478 3 .008945227 -.01197008
                    1 573 461.65 -.1154449 3 -.03793382 -.025115095
                    1 574 459.08 .0089275865 3 -.0042141736 -.04437515
                    1 575 464.68 -.2218975 2 -.03883289 -.09056226
                    1 576 362.99 .1751152 3 -.003365763 -.04764247
                    1 577 427.05 -.0969147 4 -.012286625 -.033523023
                    1 578 386.48 -.03043021 4 .017590337 .0431491
                    1 579 375 .0028491446 4 .013230603 .02295336
                    1 580 376.34 -.083706 4 -.025947643 -.045251
                    1 581 341.64 -.0954448 3 -.006803245 .035611454


                    Last edited by Kate Lussy; 09 May 2019, 02:02.

                    Comment

                    Working...
                    X