Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • ASGEN : A Stata module for weighted average mean

    Thanks to Kit Baum, I have shared a new program asgen on SSC that computes weighted average mean. The program can be installed by
    Code:
    ssc install asgen
    Following are related details:


    Syntax

    [bysort varlist]: asgen newvar = exp [if] [in] [, weights(varname) by(varlist)]


    Description

    asgen creates a new variable from an existing variable or an expression. The new variable contains weighted average mean. The existing variable is supplied to asgen by the =exp. The weights are based on the values of an existing numeric variable, that is specified in the weights(varname) option.


    asgen accepts both the [if] [in] qualifiers for performing the required calculations on a subset of data. Further, the use of =exp can come handy in many situations where we want to make changes on the fly before we find the weighted average mean. For example, we might want to multiply the value of 100 to a variable before we find its weighted average. In such a case, the expression will look like:

    Code:
     asgen wmX = X * 100, w(Y)

    where wmX is a new variable to be created and shall contain the weighted average; X and Y are existing variables in the data set. Options

    weight(varname) is an optional option. Therefore, without this option, asgen works like egen command and finds simple mean


    Example 1: Weighted average mean for kstock using the variable mvalue as a weight

    Code:
    webuse grunfeld
     asgen WM_kstock = kstock, w(mvalue)

    Example 2: Weighted average mean using an expression

    .
    Code:
    asgen WM_kstock = kstock / invest, w(mvalue)
    This will divide the variable kstock on the variable invest before finding the weighted average mean.

    Example 3: Avoiding the use of [if] using an expression

    We use the [if] qualifier to perform calculation on a subset of the data. Using [if] condition, the resulting variable will have missing values where the condition is not true. There might be circumstance where we want to use [if] without encountering missing values in the new variable. With asgen, the use of an expression can come very handy in such a s situation. To proceed with an example, let us use astile [can be downloaded from ssc by: ssc install astile] to make five groups of firms based on the variable mvalue and five groups based on kstock.

    Code:
    astile size5 = mvalue, nq(5)
    astile kstock5 = kstock, nq(5)


    Now, let's find weighted average mean in each year for a combination of firms that are in the first quantile of mvalue (size == 1) and third quantile of kstock (kstock5 == 3) using mvalue as a weighting variable


    Code:
    bys year: asgen WM_invest = (kstock / (size5 ==1 & kstock5==3)), w(mvalue)
    Off-course, we could have done that using [if] qualifier, but that will generate missing values where the condition is not true

    Code:
    bys year: asgen WM_invest_IF = kstock if size5 ==1 & kstock5==3 , w(mvalue)
    sort id year
    The difference between WM_invest and WM_invest_IF is that the former spreads the results within years, and the later does not. The use of this trick is borrowed from the behavior of egen. Further details related to this trick can be read here in Nick Cox's column.
    *

    Comments for improvement of asgen are greatly appreciated.
    Last edited by Attaullah Shah; 10 Oct 2017, 22:06.
    Regards
    --------------------------------------------------
    Attaullah Shah, PhD.
    Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    FinTechProfessor.com
    https://asdocx.com
    Check out my asdoc program, which sends outputs to MS Word.
    For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

  • #2
    Dear Attaullah,

    Asgen has been really great additional STATA module. I was wondering though if it is possible to create equal-weighted portfolios through the use of asgen? Thank you in advance.

    Best regards,
    Kate

    Comment


    • #3
      If you do not use the weight option, asgen then uses equal weights, just like egen function.
      Regards
      --------------------------------------------------
      Attaullah Shah, PhD.
      Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
      FinTechProfessor.com
      https://asdocx.com
      Check out my asdoc program, which sends outputs to MS Word.
      For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

      Comment


      • #4
        I was wondering if asgen can be used to derive the long/short portfolio average excess returns, in which I would need to explicitly spread the means because there will be missing values for the other portfolio due to the portfolio sort.

        I gathered this code from other posts, as well as my own but I don't know how to actually make it work.
        bys ymdate : asgen ExUSD = F_Excess_USD_w(cond(IVOL_w_5 == 1, F_Excess_USD_w, .)), w(MV_USD_w)

        The F_Excess_USD_w before the parenthesis was mean, and I tried to replace it, but it doesnt work...

        Would really appreciate your feedback! Thank you!

        Comment


        • #5
          If you want to use if qualifier type statement and still do nto want have missing values where the condition is not met, then you can use this trick

          Code:
          bys ymdate : asgen ExUSD = (F_Excess_USD_w / (IVOL_w_5 == 1)), w(MV_USD_w)
          The help file of asgen provides the following material

          Example 3: Avoiding the use of [if] using an expression

          We use the [if] qualifier to perform calculations on a subset of the data. Using [if] condition, the resulting variable will have missing values where the condition is not true.
          There might be circumstances where we want to use [if] without encountering missing values in the new variable. The use of expression can come very handy in such as situation.
          To proceed with an example, let us use astile [can be downloaded from ssc by clicking here install astile] to make five groups of firms based on mvalue and five groups based
          kstock.

          Code:
          astile size5 = mvalue, nq(5)
              . astile kstock5 = kstock, nq(5)

          Now, let's find weighted average mean in each year for a combination of firms that are in the first quantile of mvalue (size == 1) and third quantile of kstock (kstock5 == 3)
          using mvalue as a weighting variable


          Code:
              . bys year: asgen WM_invest = (kstock / (size5 ==1 & kstock5==3)), w(mvalue)
          Off-course, we could have done that using [if] qualifier, but that will generate missing values where the condition is not true

          Code:
              . bys year: asgen WM_invest_IF = kstock if size5 ==1 & kstock5==3 , w(mvalue)
          The difference between WM_invest and WM_invest_IF is that the former spreads the results within years, and the later does not. The use of this trick is borrowed from the
          behavior of egen. Further details related to this trick can be read here https://ideas.repec.org/a/tsj/stataj...2p305-314.html in Nick Cox's column.
          *

          Regards
          --------------------------------------------------
          Attaullah Shah, PhD.
          Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
          FinTechProfessor.com
          https://asdocx.com
          Check out my asdoc program, which sends outputs to MS Word.
          For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

          Comment


          • #6
            Would this work too?

            Code:
            bys year : asgen WM_Invest_IF =  cond(size5 == 1 & kstock5 == 3, kstock .),  w(mvalue) 
            This and the other trick are discussed in Sections 9 and 10 of the paer cited in #5.

            Comment


            • #7
              Thank you to you both! It worked perfectly. I very much appreciate your help

              Comment


              • #8
                https://journals.sagepub.com/doi/pdf...867X1101100210 takes you straight to a .pdf (public access) of the paper mentioned in #5.

                Comment


                • #9
                  Originally posted by Attaullah Shah View Post
                  If you do not use the weight option, asgen then uses equal weights, just like egen function.
                  I have tried this, but the result is actually the same in both cases. Would I have to add anything to the code to make it work for equally-weighted average excess returns?

                  bys ymdate : asgen ExUSD_1_VW_1_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==1)), w(MV_USD_w)
                  bys ymdate : asgen ExUSD_1_VW_5_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==5)), w(MV_USD_w)
                  bys ymdate : asgen EXUSD_1_VW_5_1_Size = ExUSD_1_VW_5 - ExUSD_1_VW_1

                  and

                  bys ymdate : asgen ExUSD_1_VW_1_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==1))
                  bys ymdate : asgen ExUSD_1_VW_5_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==5))
                  bys ymdate : asgen EXUSD_1_VW_5_1_Size = ExUSD_1_VW_5 - ExUSD_1_VW_1

                  Thank you kindly.

                  Comment


                  • #10
                    Originally posted by Kate Lussy View Post

                    I have tried this, but the result is actually the same in both cases. Would I have to add anything to the code to make it work for equally-weighted average excess returns?



                    bys ymdate : asgen ExUSD_1_VW_1_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==1)), w(MV_USD_w)
                    bys ymdate : asgen ExUSD_1_VW_5_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==5)), w(MV_USD_w)
                    bys ymdate : asgen EXUSD_1_VW_5_1_Size = ExUSD_1_VW_5 - ExUSD_1_VW_1

                    and

                    bys ymdate : asgen ExUSD_1_VW_1_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==1))
                    bys ymdate : asgen ExUSD_1_VW_5_Size = (F_Excess_USD_w / (nq_Size_w ==1 & IVOL_Size ==5))
                    bys ymdate : asgen EXUSD_1_VW_5_1_Size = ExUSD_1_VW_5 - ExUSD_1_VW_1

                    Thank you kindly.
                    I want to add upon my post. This was also the case when I used _gwtmean.

                    My procedure is to:
                    1. Create portfolios
                    2. Limit each portfolio to one observation per month
                    3. To generate the value-weighted excess returns for each portfolio
                    4. To find t-statistics

                    I am wondering if it makes sense to move 2. to after 3.? If I do so, I then generate different values for the for equal and value weighting. I tried to look at the data to check if it is right, but I cannot make sense of it. So, it would be great to hear someone's feedback!


                    *Generate Future Excess Returns
                    tsset DSCode1 ymdate
                    gen F_Excess_USD_w=F.Excess_USD_w
                    drop if F_Excess_USD_w==.

                    *Univariate Portfolio Analysis
                    * Portfolio sorting according to IVOL
                    ssc install astile
                    astile IVOL_w_5 = IVOL_w, nq(5)

                    sort ymdate IVOL_w_5 DSCode1
                    by ymdate IVOL_w_5: drop if _n >1


                    * Value-weighted average excess returns of each portfolio in each month (ymdate)
                    bys ymdate: asgen ExUSD_1_VW = (F_Excess_USD_w / (IVOL_w_5 == 1)), w(MV_USD_w)

                    *Equal-weighted average excess returns of each portfolio in each month (ymdate)
                    bys ymdate: asgen ExUSD_1_EW = (F_Excess_USD_w / (IVOL_w_5 == 1))

                    sort ymdate IVOL_w_5 DSCode1
                    by ymdate IVOL_w_5: drop if _n >1


                    tsset DSCode1 ymdate
                    sort DSCode1 ymdate

                    *Value-weighted average excess returns & t-statistics of each portfolio
                    newey ExUSD_1_VW, lag(6) force
                    newey ExUSD_1_EW, lag(6) force

                    [CODE]
                    * Example generated by -dataex-. To install: ssc install dataex
                    clear
                    input long DSCode1 float(ymdate MV_USD_w F_Excess_USD_w IVOL_w_5 ExUSD_1_EW ExUSD_1_VW)
                    1 559 151.45 .07484736 1 .020173784 .034087826
                    1 560 163.49 .0005981706 4 .006313993 .003041933
                    1 561 164.27 .012566957 4 .005149428 -.01948735
                    1 562 329.46 .019797264 3 .0251439 .0203678
                    1 563 337.62 .05277631 3 .03560889 .02851063
                    1 564 356.64 .07170374 3 -.005529334 -.03746695
                    1 565 383.81 .03556859 4 .016249549 .015531148
                    1 566 399.09 -.026278155 3 .033919852 .05264597
                    1 567 390.26 -.012818905 3 .011069474 -.005305402
                    1 568 380.41 -.012739185 3 -.009749626 -.0354712
                    1 569 377.02 -.04112721 3 -.00739424 -.04183678
                    1 570 363.43 -.04535 3 -.018767476 -.009455224
                    1 571 348.14 .02038562 3 -.012077605 -.005213041
                    1 572 356.64 -.07103478 3 .008945227 -.01197008
                    1 573 461.65 -.1154449 3 -.03793382 -.025115095
                    1 574 459.08 .0089275865 3 -.0042141736 -.04437515
                    1 575 464.68 -.2218975 2 -.03883289 -.09056226
                    1 576 362.99 .1751152 3 -.003365763 -.04764247
                    1 577 427.05 -.0969147 4 -.012286625 -.033523023
                    1 578 386.48 -.03043021 4 .017590337 .0431491
                    1 579 375 .0028491446 4 .013230603 .02295336
                    1 580 376.34 -.083706 4 -.025947643 -.045251
                    1 581 341.64 -.0954448 3 -.006803245 .035611454


                    Last edited by Kate Lussy; 09 May 2019, 02:02.

                    Comment


                    • #11
                      Hello Kate, were you able to calculate value-weighted returns? Can you help me in understanding that how do we exactly assign weights to the portfolios we created?

                      Comment


                      • #12
                        Dear Attaullah Shah

                        How does asgen deal with missing values?

                        Comment


                        • #13
                          See the following example to understand how asgen handles missing values.
                          Code:
                          webuse grunfeld
                          keep if company == 1
                          
                          * Without missing values either in the given variable or the weight variables
                          asgen nomiss_mean = invest, weight(mvalue)
                          
                          * Add a missing value to the given variable
                          replace invest = . in 1
                          asgen miss1_mean = invest, weight(mvalue)
                          
                          * When the the weighting variable has a missing value
                          replace mvalue = .  in 10
                          asgen miss2_mean = invest, weight(mvalue)
                          
                          list
                          
                               +-------------------------------------------------------------------------------------+
                               | company   year   invest   mvalue   kstock   time   nomiss_~n   miss1_~n   miss2_m~n |
                               |-------------------------------------------------------------------------------------|
                            1. |       1   1935        .   3078.5      2.8      1   649.50206   661.7243   668.03872 |
                            2. |       1   1936    391.8   4661.7     52.6      2   649.50206   661.7243   668.03872 |
                            3. |       1   1937    410.6   5387.1    156.9      3   649.50206   661.7243   668.03872 |
                            4. |       1   1938    257.7   2792.2    209.2      4   649.50206   661.7243   668.03872 |
                            5. |       1   1939    330.8   4313.2    203.4      5   649.50206   661.7243   668.03872 |
                               |-------------------------------------------------------------------------------------|
                            6. |       1   1940    461.2   4643.9    207.2      6   649.50206   661.7243   668.03872 |
                            7. |       1   1941      512   4551.2    255.2      7   649.50206   661.7243   668.03872 |
                            8. |       1   1942      448   3244.1    303.7      8   649.50206   661.7243   668.03872 |
                            9. |       1   1943    499.6   4053.7    264.1      9   649.50206   661.7243   668.03872 |
                           10. |       1   1944    547.5        .    201.6     10   649.50206   661.7243   668.03872 |
                               |-------------------------------------------------------------------------------------|
                           11. |       1   1945    561.2   4840.9      265     11   649.50206   661.7243   668.03872 |
                           12. |       1   1946    688.1   4900.9    402.2     12   649.50206   661.7243   668.03872 |
                           13. |       1   1947    568.9   3526.5    761.5     13   649.50206   661.7243   668.03872 |
                           14. |       1   1948    529.2   3254.7    922.4     14   649.50206   661.7243   668.03872 |
                           15. |       1   1949    555.1   3700.2   1020.1     15   649.50206   661.7243   668.03872 |
                               |-------------------------------------------------------------------------------------|
                           16. |       1   1950    642.9   3755.6     1099     16   649.50206   661.7243   668.03872 |
                           17. |       1   1951    755.9     4833   1207.7     17   649.50206   661.7243   668.03872 |
                           18. |       1   1952    891.2   4924.9   1430.5     18   649.50206   661.7243   668.03872 |
                           19. |       1   1953   1304.4   6241.7   1777.3     19   649.50206   661.7243   668.03872 |
                           20. |       1   1954   1486.7   5593.6   2226.3     20   649.50206   661.7243   668.03872 |
                               +-------------------------------------------------------------------------------------+
                          Findings:

                          1. If the variable for which mean is calculated (call it focal variable) has missing values, rows having missing values are dropped from the calculation. However, the newly generated variable reports the mean values even for observations with missing values in the focal variable, just like Stata's egen command.

                          2. Similarly, if the weighting variable has missing values, rows having missing values are dropped from the calculation. However, the newly generated variable reports the mean values even for observations with missing values in the weighting variable.
                          Last edited by Attaullah Shah; 13 Jul 2021, 20:41.
                          Regards
                          --------------------------------------------------
                          Attaullah Shah, PhD.
                          Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
                          FinTechProfessor.com
                          https://asdocx.com
                          Check out my asdoc program, which sends outputs to MS Word.
                          For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

                          Comment


                          • #14
                            I have a doubt on asgen command. If my weight variable is varying, how to then use the asgen command?

                            Comment


                            • #15
                              I have sorted it. Thank you. Your command has really come of great help, professor.

                              Comment

                              Working...
                              X