Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • John Harperson
    started a topic Calculate market betas for stocks

    Calculate market betas for stocks

    Hi all,
    I just started using STATA, so I don't have much knowledge about it.
    My problem is the following:
    I have monthly return data for all NYSE stocks for 40 years and have to calculate an individual beta for each stock on a rolling basis.
    So for a particular stock and a particular month, I want to regress the returns from the last 5 years on the market returns.
    For example stock Apple: beta in January 1980 would be a regression of Apple's returns from January 1975 to December 1979 on the market returns during this period.
    The beta for February 1980 would then include date from February 1975 to January 1980.
    I know that I have to work with the following command: rolling _b, window(60) saving(betas_`i', replace): regress stockexcessreturn marketexcessreturn
    However, I don't know how to apply this command exactly to my problem so that STATA is calculating the beta for each month (is that already included in the rolling command?).
    And how can I include that STATA should should just use the returns from a particular stock to calculate its beta and not from other stocks? I think you can't combine the command rolling with by companyID, can you?
    I have to include another restriction where I am unsure how to code this: STATA should just calculate the beta for a particular stock if at least 30 return observation were available during the last 60 months (the 5year period).
    I know that these are a lot of questions! However, it would make me really happy if someone can help me out...
    Thank you very, very much in advance!

  • Attaullah Shah
    replied
    For rolling betas, you can also explore asreg, see this video

    Leave a comment:


  • Rolf van Baar
    replied
    Hi there,

    I'm not very familiar with stata yet and would like some help!
    My problem is similar to that of the original post (#1), but I have some more problems.

    I have three different data sets, these are the following:
    • The file sp500inclusions contains information about when the stock of a company was included in the S&P 500 index. Permno is a stock security identifier. Dateann refers to the date when S&P announced the inclusion of a new stock and Dateeff to the date when the stock was actually included (usually a few days later).
    • The file sp500stockret contains daily stock returns for each stock in sp500inclusions. For each stock it covers the period starting 1 year prior to it's Dateann until 1 year after it's Dateeff.
    • The file sp500marketret contains the index return of the S&P 500 (sprtrn), which serves as a proxy for the market return, as well as the yield on short-term US government debt (rf), which serves as a proxy for the risk-free rate.
    For each stock I want to estimate pre- and post-inclusion betas. For pre-inclusion betas I will have to use 1 year of stock returns until 10 days prior to the inclusion announcement. For post-inclusion betas I will have to use 1 year of stock returns starting 10 days after effective inclusion.

    What's the best way to handle this? I find it especially tricky how to combine the data from the different sets and how to make the rolling window work in such a way that it takes exactly the right period for the pre- and post-inclusions.

    Thank you very much in advance for your help!

    Best,
    Rolf

    Leave a comment:


  • Clyde Schechter
    replied
    I know I have to perform some sort of loop to repeat the command for all stocks.
    No, you don't. What you have to do is get your data into a more useful layout, and then the -rolling- command (or several other approaches that you can find in many threads here--this is a very frequently asked question) will handle it.

    Code:
    rename (NESTLER-whatever) ret=
    reshape long ret, i(Date) j(stock) string
    xtset stock Date
    In the above code, replace whatever by the name of the last stock before EUROSTOXX.

    From there, the -rolling- command, modified as follows:

    Code:
    rolling _b, window(365) stepsize(365) saving(beta_practice, replace): reg ret EUROSTOXX, robust
    will do it.

    -rolling- is by now pretty old, and there are other community-contributed commands that can be used to do this more efficiently. You can find models of these earlier in this thread. What all of them have in common, and also with -rolling-, is that the require the data to be in long layout. Your wide layout is easy on human eyes, but it doesn't play well with most Stata commands. When dealing with panel data like this, you should form the habit of working with it in long layout--that will save you much time and frustration. Wide layout is best for a handful of Stata commands, but only a handful.

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.



    Leave a comment:


  • Jan Scheur
    replied
    Goodafternoon,

    I want to calculate the betas over all the stocks in my sample, there are 600 stocks in the sample. I have stock return data for 18 years for every stock. I already calculated the beta of one stock, however, now I want to do this for the stocks in my sample. I used the following to calculate the beta:

    rolling _b, window(365) stepsize(365) saving(beta_practice, replace): reg NESTLER EUROSTOXX, robust

    I know I have to perform some sort of loop to repeat the command for all stocks. I have daily stock returns, the idea is that I get a beta for every stock for every year. A simplified sample looks like this:
    Date NESTLER (...) EUROSTOXX
    03jan2000 x x x
    04jan2000 x x x
    (...) (...) (...) (...)
    31dec2017 x x x
    Between NESTLER and EUROSTOXX are 599 other stock returns. EUROSTOXX is the index.
    Thanks in advance.

    Regards,
    Jan Scheur

    Leave a comment:


  • Arjan Trinks
    replied
    @#31

    Hi Robert,

    Many thanks for your quick solution. This seems to work perfectly for the problem I posted. (#30)
    Will save me hours of waiting (and coffee consumption).
    Again, thanks so much!

    Leave a comment:


  • Johanna Moser
    replied
    Thank you so much Clyde! You are brilliant!

    Leave a comment:


  • Clyde Schechter
    replied
    Yes. This suggests that there are some groups where the -newey- command cannot be executed as requested. In order to see what is going on, you can add the -verbose- option to the -runby- command. This will generate a lot of output, but you can break it off after you get some error messages coming out and you'll have a sense of what's going on. It's likely a problem with your data, perhaps groups that do not have enough non-missing observations to run with the number of lags you have calculated. (You have calculated the number of lags on the basis of the number of non-missing values of ex_return, but the estimation sample might actually be smaller than that due to missing values in the other variables.)

    Leave a comment:


  • Johanna Moser
    replied
    Thank you so so much Clyde!!!!

    I adjusted the code:
    Code:
    capture program drop my_newey
    program define my_newey
        tsset week
     quietly summarize ex_return
     scalar lags = floor(4*(r(N)/100)^(2/9))
        newey ex_return mkt_rf hml smb twexb_change_percentage,lag(`=lags')
        if e(N) >= 40 {
            gen n_obs = e(N)
            foreach v of varlist mkt_rf hml smb twexb {
                gen b_`v' = _b[`v']
                gen se_`v' = _se[`v']
       gen t_`v' = _b[`v'] / _se[`v']
       gen p_`v' = 2*ttail(e(df_r), abs(_b[`v'] / _se[`v']))
            }
        }
        exit
    end
    
    gen year = yofd(dofw(week))
    runby my_newey, by(company_id year) status
    Only the number of errors that it displays worries me a little. Do you think there is something wrong?

    Click image for larger version

Name:	Stata.PNG
Views:	1
Size:	5.2 KB
ID:	1422421


    Again, thank you so much for your help!

    Leave a comment:


  • Clyde Schechter
    replied
    Here's the gist of it. Bear in mind that I don't know anything about -newey-. I discovered that you have to specify a lag() option to make it run. So I arbitrarily picked -lag(1)- to illustrate the approach. Change that to whatever you need. I also don't know which output from this regression you would consider "the beta." So I generated the regression coefficients and standard errors for all variables. Again, tailor this code to your actual needs.

    Code:
    capture program drop my_newey
    program define my_newey
        tsset week
        newey ex_return mkt_rf hml smb twexb, lag(1)
        if e(N) >= 40 {
            gen n_obs = e(N)
            foreach v of varlist mkt_rf hml smb twexb {
                gen b_`v' = _b[`v']
                gen se_`v' = _se[`v']
            }
        }
        exit
    end
    
    gen year = yofd(dofw(week))
    runby my_newey, by(company_id year) status
    The key points are:

    1. -runby-, is written by Robert Picard and me, and you can install it with -ssc install runby-. It's purpose is precisely for projects like this: run the same commands on chunks of the data set and capture results as you go along. You could also do this with official -statsby-, but -runby- is much faster and, in my opinion, simpler to use.

    2. The simplest way to determine whether 40 observations are available is to actually run -newey- and see how many it says it had. So that's the approach I used. -newey- seems to run very quickly on a set of up to 52 observations, so I don't think this will create much inefficiency compared to explicitly counting observations with missing values.

    3. The data that -newey- is applied to in this approach is not the original data set, but is a chunk of a clone of the original data set. That chunk does not carry with it the information about how the data are -tsset- or -xtset-, so the -tsset- needs to be done inside the program my_newey. (I guess this means that it isn't really a full "clone" of the original.)

    Leave a comment:


  • Johanna Moser
    replied
    Hi Clyde,
    first of all, thank you for your answer!!!!


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int company_id float(week ex_return) double(mkt_rf smb hml twexb_change_percentage)
    6 1820  -1.922182   .17 -1.21  1.85                  .
    6 1821  4.0626664  1.07   -.3   .13 -.0319025921651096
    6 1822   3.007111  -.22   .47  -.96 -.5542856415845588
    6 1823 -1.3971034   .79 -1.21   .09  .2684526551012281
    6 1824 -3.1557686  1.76 -1.23   .11  .0166934967665756
    6 1825   1.702802   .83  1.09  -.77  -.398238226404705
    6 1826      -.099  -.03  -.37   .66  .2480528809463563
    6 1827      -.099   .95  -.66   .87 -.4569744501027921
    6 1828  1.6549115  -.33   .75 -1.05 -.2111390276694827
    6 1829  1.6241304   .47  -.84  -.74 -.4712981869436382
    6 1830      -.115   .95  -.44  -.56 -.1046784368881651
    6 1831  1.5944017    .9  -.54  -.23  .2399777486939216
    6 1832  2.4060085   .09   .54  1.36 -.5425828016846495
    6 1833  3.1676884   .71 -1.18  1.41  -.874923088734785
    6 1834  2.2699523   .73   .69  -.24 -1.114025020836144
    6 1835  -5.537357  -.49  -.38  1.53 -.9998190772054948
    6 1836   2.757852  1.15   .25  -.53 -.1408511913849036
    6 1837 -2.1260319   .68 -1.17   .18 -.0387217579454937
    6 1838  2.3050244  1.28   -.1   .65 -.3285368222462772
    6 1839      -.134 -1.06  1.35   .26  1.493984971669278
    6 1840  -4.895905   .68  -.58    .4  .4127188016588153
    6 1841     -3.868  1.36 -1.24   .72 -1.093935111206834
    6 1842 -1.8496017  -.58  2.14 -1.42   .458236478468209
    6 1843  2.0846431  2.16  -.13 -1.32 -.0730010972286157
    6 1844   6.778552  1.78  -.75   -.3 -.0450502304535142
    6 1845  -7.376065  -.83  1.13    .2 -.0694329860945689
    6 1846   2.495696  2.21  -.67   .37 -.1279917376069615
    6 1847  -2.655373   .84  1.45 -1.13   .259751211377123
    6 1848  -.9825652 -1.41   .28   .02  .0174858453188828
    6 1849  -.9901929  2.14   .64  -.96  .0220194612202905
    6 1850  -2.770867   -.8    .5  1.36  .1129493497945198
    6 1851  -5.570546  -.32   .99  -.39  .6373720117661943
    6 1852  -4.442923  1.16   .68   -.5  1.553472254247672
    6 1853      -.116     0     0   .68  1.090733736844261
    6 1854   5.922151   .64  -.44  2.01 -.0910189631752859
    6 1855  2.2616682  1.91   .61 -1.06 -.0330792971773431
    6 1856  -7.052444  1.31 -1.33   .14  .5664963964832207
    6 1857  2.8770745  -.35   -.1 -1.71  .8852092428921295
    6 1858 -3.4896426   .15 -1.08  1.45 -1.404561685347355
    6 1859     -1.118  -.98  -2.2  1.41 -.0493520007365284
    6 1860   6.952707   .42  -.76   .63  .1843313036539588
    6 1861   -24.6463   .34    .1 -1.41  .0902320588945803
    6 1862     -6.368 -1.66  -1.2  -.52 -.0712446993730011
    6 1863   6.561666  2.23   .17 -1.08  .6997006016826614
    6 1864      4.895   .25   .35  -.16  .6612963318752312
    6 1865   5.252143   .71 -1.03   .45  .1477350975779138
    6 1866  -5.189746   -.3   -.8  1.58 -.1752095354420207
    6 1867  -7.264857  1.46   .67  -.58  .1490418287527331
    6 1868   6.288256  1.14  -.78  -.38  .3260769967213287
    6 1869 -1.3268193   -.7   .16   .58  .4610708980489516
    6 1870  2.3170245  -.34   .75   .53  .0933166276018613
    6 1871   7.020857   .78   .29  -.18 -.2994205773346552
    6 1872  2.1152222  -.48  -.19   .59  .2020980500156001
    6 1873 -3.3678696 -2.32   .12   .79  .3911540455130725
    6 1874  1.5783932  1.28 -1.63  -.12   .153957456110667
    6 1875   .4454862  1.67   .16 -1.02  .4569010137881136
    6 1876  -.3727253  2.27  -.41  -.82  .4881379781279293
    6 1877   5.755884   2.7 -1.78  -.71  -.385689339900815
    6 1878  -2.244938  -.82   1.1   .59  .0982943704509014
    6 1879 -2.2256596  1.61  -.41 -1.33 -.2600022923156712
    6 1880 -1.1849566 -2.02  1.61  1.58 -.3573739433812891
    6 1881 -2.2958024 -1.55   .11  -.26  .3787079875696428
    6 1882  -7.963169  1.42   .62 -1.04  .0257729824619537
    6 1883   5.999561  1.14  -.01   .67 -.1300734599436812
    6 1884  -2.971563  -.52  1.17    .9  .1297248010610148
    6 1885   5.211444   1.4  -.08  -.33 -.0456346003402419
    6 1886 -3.4847865  -2.7  2.03  -.53  .1548764870368509
    6 1887  -6.509349  1.49  1.18 -1.41  .2939756694820437
    6 1888  10.445005  1.79  1.32 -1.52 -.0497799068507639
    6 1889      -.106 -1.38  2.01  -.04  .0653751129112623
    6 1890 -3.1958876  1.37  -.29  -.22 -.0057706853306983
    6 1891   8.879507  2.62   .23  -.95  .1038783304443812
    6 1892  10.000383  1.12   .41   -.1   .337460133789865
    6 1893      -.106 -1.27   .86   .07  .3939867768188034
    6 1894   .3830918   .17 -1.35  -.16 -.2448665906312937
    6 1895 -1.0615385 -1.05   .16   .96  .3271194636593105
    6 1896  -3.012621  -.86 -2.46   .57 -.1892189129051771
    6 1897       -2.1   .58   .03   .21  .1150970498324158
    6 1898  3.9696326 -1.98   .47    .8 -.0010219087006377
    6 1899  -4.033569 -2.62 -2.91  2.93  .3228242574991119
    6 1900  -6.234449 -1.07   .41  -.19 -.2889855456295752
    6 1901   .4314783  -.87 -1.91   .86 -.5526763941798342
    6 1902  -4.967865  3.88 -1.28  -.91 -.0426313069936908
    6 1903 -1.2393637  -.06   .77  -.01 -.1184936406014964
    6 1904  -5.850127   .61  -.33   .78  .0928083795060269
    6 1905  1.1165122    .3   .42   .19  .0793587973647026
    6 1906   4.716277 -1.54  2.47  -.14  -.157461874813189
    6 1907 -2.4078505   .28     0   .17  .1498915678020208
    6 1908 -1.2854706   3.4  -1.5 -1.43  .3435052948506073
    6 1909 -4.2756667   .86   -.2 -1.11  .2215313302014492
    6 1910  12.934478   .19   .79  -.91 -.0955055204233619
    6 1911   8.685209  1.88 -1.42   .77  .2315794639625562
    6 1912 -13.237313  -.33  -.02    .6 -.0087725472314976
    6 1913  2.2195814   .94 -1.31   .77  .2505496074960811
    6 1914  -5.219636 -1.66  -.28  2.11  .1090870977031732
    6 1915  1.0956048   .04 -1.66  1.08 -.0147391725530304
    6 1916  -1.877148  3.32 -2.04  -.89 -.1623581272594788
    6 1917   15.56065   .78  -.48   .48 -.5026317906461209
    6 1918    -1.6645  1.12 -1.01  1.47 -.0740971962808138
    6 1919   .4271005  1.06   .01  -.02  .2193838202896143
    end
    format %tw week

    I hope this gives a clearer idea of what my data looks like. In total, i have 2137 companys with weekly obeservations for the time period 1995-2015.
    What I have to do is a Newey-Regression which creates yearly Beta per Company. It should NOT be rolling, hence i want to do a separate single regression for each company's data for each calender year.

    Additionally companies need to have at least 40 non-missing-values in the respective year, otherwise no regression should be run in the respective year.
    Thank you so much!
    Last edited by Johanna Moser; 14 Dec 2017, 01:44.

    Leave a comment:


  • Clyde Schechter
    replied
    I don't understand what you are trying to do. I assume you have multiple observations per company per year, and several years of data for many companies. Is that right? Is your data daily, monthly, or what? These questions could be answered by posting a short representative example of your data using the -dataex- command. If you are using Stata version 15.1, -dataex- is part of your official installation. If you have an earlier version, run -ssc install dataex- to get the command. Then run -help dataex- to read the simple instructions for using it.

    Most unclear is whether you want to do a rolling regression, in which each observation receives results of a regression for the observations obtained in the preceding years, or whether you simply want to do a separate single regression for each company's data for each calendar year.

    Leave a comment:


  • Johanna Moser
    replied
    Hi,
    I have a similar Problem to the one discussed in this thread, yet I am not supposed to use the rolling command.
    My data is long, with 2137 companys and the respective Company_id from 1 to 2137 over a time period from 1995 to 2015 (weekyl data).
    I do have to calculate a Regression which Looks like this:

    Weekly-Excess-Return= b1*Marketfactor+b2*Hml+b3*Smb+b4*Exchange_Rate

    (newey ex_return mkt_rf hml smb twexb)

    Now I fail to compute a Loop, which calculates one yearly Beta per Company.
    I also do have to exclude those companys, which do have more than 12 missing values per year

    I would be more than happy if anyone of you could help me!

    Leave a comment:


  • Robert Picard
    replied
    rangestat has been upgraded since #11 was posted and now includes a (reg) statistic. If I understand what you are asking and shooting blind since I cannot see your data or the code you tried, the following will calculate a regression using a 12 month window only for observations that are in June.

    Code:
    * create panel of monthly stocks returns over 30 years
    clear
    set seed 312312
    set obs 1000
    gen long stockid = _n
    expand 360
    bysort stockid: gen mdate = ym(1986,12) + _n
    format %tm mdate
    gen ret = runiform()
    gen mret = runiform()
    
    * define a valid upper interval bound only when the month is june
    gen high = cond(month(dofm(mdate)) == 6, mdate, -999)
    
    * regress over a 12 month window
    rangestat (reg) ret mret, by(stockid) interval(mdate -11 high)
    
    * spot check results
    list if stockid == 2 & mdate == ym(2001,6)
    reg ret mret if stockid == 2 & inrange(mdate, ym(2000,7), ym(2001,6))
    and the results for the spot check:
    Code:
    . list if stockid == 2 & mdate == ym(2001,6)
    
            +---------------------------------------------------------------------------------+
       534. | stockid |  mdate |      ret |     mret | high | reg_nobs |   reg_r2 | reg_adj~2 |
            |       2 | 2001m6 | .7165378 | .6462917 |  497 |       12 | .1522346 | .06745806 |
            |---------------------------------------------------------------------------------|
            |        b_mret     |        b_cons     |       se_mret      |       se_cons      |
            |     .31540325     |     .37679786     |     .23536795      |     .16262095      |
            +---------------------------------------------------------------------------------+
    
    . reg ret mret if stockid == 2 & inrange(mdate, ym(2000,7), ym(2001,6))
    
          Source |       SS           df       MS      Number of obs   =        12
    -------------+----------------------------------   F(1, 10)        =      1.80
           Model |  .097820681         1  .097820681   Prob > F        =    0.2099
        Residual |  .544744687        10  .054474469   R-squared       =    0.1522
    -------------+----------------------------------   Adj R-squared   =    0.0675
           Total |  .642565368        11  .058415033   Root MSE        =     .2334
    
    ------------------------------------------------------------------------------
             ret |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            mret |   .3154033    .235368     1.34   0.210    -.2090292    .8398357
           _cons |   .3767979   .1626209     2.32   0.043     .0144558    .7391399
    ------------------------------------------------------------------------------

    Leave a comment:


  • Arjan Trinks
    replied
    Hi there,

    I was wondering whether it would be possible to specify a 'stepsize' in the Rangestat command, similar to that in Rolling.
    My method is identical to that in this post (#1): I have a panel of monthly returns for a large number of stocks, and need to estimate the market beta of each stock (slope of regression of stock's excess return on the market excess return).
    Currently, I've adopted the Rangestat code in #11 and it works well.

    However, the code takes very long to run due to the large panel as well as the fact that I'm running the code several times for robustness analyses (e.g., using different window sizes/intervals).
    The thing is that I only need Rangestat to calculate the beta for each stock in June (month==6).
    With a stepsize=12 option, the market betas would be calculated for June in year1 and then move on to June in year2, etc.

    Is there a way to do so? Any help is appreciated!

    Leave a comment:

Working...
X