Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • John Harperson
    started a topic Calculate market betas for stocks

    Calculate market betas for stocks

    Hi all,
    I just started using STATA, so I don't have much knowledge about it.
    My problem is the following:
    I have monthly return data for all NYSE stocks for 40 years and have to calculate an individual beta for each stock on a rolling basis.
    So for a particular stock and a particular month, I want to regress the returns from the last 5 years on the market returns.
    For example stock Apple: beta in January 1980 would be a regression of Apple's returns from January 1975 to December 1979 on the market returns during this period.
    The beta for February 1980 would then include date from February 1975 to January 1980.
    I know that I have to work with the following command: rolling _b, window(60) saving(betas_`i', replace): regress stockexcessreturn marketexcessreturn
    However, I don't know how to apply this command exactly to my problem so that STATA is calculating the beta for each month (is that already included in the rolling command?).
    And how can I include that STATA should should just use the returns from a particular stock to calculate its beta and not from other stocks? I think you can't combine the command rolling with by companyID, can you?
    I have to include another restriction where I am unsure how to code this: STATA should just calculate the beta for a particular stock if at least 30 return observation were available during the last 60 months (the 5year period).
    I know that these are a lot of questions! However, it would make me really happy if someone can help me out...
    Thank you very, very much in advance!

  • Clyde Schechter
    replied
    I know I have to perform some sort of loop to repeat the command for all stocks.
    No, you don't. What you have to do is get your data into a more useful layout, and then the -rolling- command (or several other approaches that you can find in many threads here--this is a very frequently asked question) will handle it.

    Code:
    rename (NESTLER-whatever) ret=
    reshape long ret, i(Date) j(stock) string
    xtset stock Date
    In the above code, replace whatever by the name of the last stock before EUROSTOXX.

    From there, the -rolling- command, modified as follows:

    Code:
    rolling _b, window(365) stepsize(365) saving(beta_practice, replace): reg ret EUROSTOXX, robust
    will do it.

    -rolling- is by now pretty old, and there are other community-contributed commands that can be used to do this more efficiently. You can find models of these earlier in this thread. What all of them have in common, and also with -rolling-, is that the require the data to be in long layout. Your wide layout is easy on human eyes, but it doesn't play well with most Stata commands. When dealing with panel data like this, you should form the habit of working with it in long layout--that will save you much time and frustration. Wide layout is best for a handful of Stata commands, but only a handful.

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.



    Leave a comment:


  • Jan Scheur
    replied
    Goodafternoon,

    I want to calculate the betas over all the stocks in my sample, there are 600 stocks in the sample. I have stock return data for 18 years for every stock. I already calculated the beta of one stock, however, now I want to do this for the stocks in my sample. I used the following to calculate the beta:

    rolling _b, window(365) stepsize(365) saving(beta_practice, replace): reg NESTLER EUROSTOXX, robust

    I know I have to perform some sort of loop to repeat the command for all stocks. I have daily stock returns, the idea is that I get a beta for every stock for every year. A simplified sample looks like this:
    Date NESTLER (...) EUROSTOXX
    03jan2000 x x x
    04jan2000 x x x
    (...) (...) (...) (...)
    31dec2017 x x x
    Between NESTLER and EUROSTOXX are 599 other stock returns. EUROSTOXX is the index.
    Thanks in advance.

    Regards,
    Jan Scheur

    Leave a comment:


  • Arjan Trinks
    replied
    @#31

    Hi Robert,

    Many thanks for your quick solution. This seems to work perfectly for the problem I posted. (#30)
    Will save me hours of waiting (and coffee consumption).
    Again, thanks so much!

    Leave a comment:


  • Johanna Moser
    replied
    Thank you so much Clyde! You are brilliant!

    Leave a comment:


  • Clyde Schechter
    replied
    Yes. This suggests that there are some groups where the -newey- command cannot be executed as requested. In order to see what is going on, you can add the -verbose- option to the -runby- command. This will generate a lot of output, but you can break it off after you get some error messages coming out and you'll have a sense of what's going on. It's likely a problem with your data, perhaps groups that do not have enough non-missing observations to run with the number of lags you have calculated. (You have calculated the number of lags on the basis of the number of non-missing values of ex_return, but the estimation sample might actually be smaller than that due to missing values in the other variables.)

    Leave a comment:


  • Johanna Moser
    replied
    Thank you so so much Clyde!!!!

    I adjusted the code:
    Code:
    capture program drop my_newey
    program define my_newey
        tsset week
     quietly summarize ex_return
     scalar lags = floor(4*(r(N)/100)^(2/9))
        newey ex_return mkt_rf hml smb twexb_change_percentage,lag(`=lags')
        if e(N) >= 40 {
            gen n_obs = e(N)
            foreach v of varlist mkt_rf hml smb twexb {
                gen b_`v' = _b[`v']
                gen se_`v' = _se[`v']
       gen t_`v' = _b[`v'] / _se[`v']
       gen p_`v' = 2*ttail(e(df_r), abs(_b[`v'] / _se[`v']))
            }
        }
        exit
    end
    
    gen year = yofd(dofw(week))
    runby my_newey, by(company_id year) status
    Only the number of errors that it displays worries me a little. Do you think there is something wrong?

    Click image for larger version

Name:	Stata.PNG
Views:	1
Size:	5.2 KB
ID:	1422421


    Again, thank you so much for your help!

    Leave a comment:


  • Clyde Schechter
    replied
    Here's the gist of it. Bear in mind that I don't know anything about -newey-. I discovered that you have to specify a lag() option to make it run. So I arbitrarily picked -lag(1)- to illustrate the approach. Change that to whatever you need. I also don't know which output from this regression you would consider "the beta." So I generated the regression coefficients and standard errors for all variables. Again, tailor this code to your actual needs.

    Code:
    capture program drop my_newey
    program define my_newey
        tsset week
        newey ex_return mkt_rf hml smb twexb, lag(1)
        if e(N) >= 40 {
            gen n_obs = e(N)
            foreach v of varlist mkt_rf hml smb twexb {
                gen b_`v' = _b[`v']
                gen se_`v' = _se[`v']
            }
        }
        exit
    end
    
    gen year = yofd(dofw(week))
    runby my_newey, by(company_id year) status
    The key points are:

    1. -runby-, is written by Robert Picard and me, and you can install it with -ssc install runby-. It's purpose is precisely for projects like this: run the same commands on chunks of the data set and capture results as you go along. You could also do this with official -statsby-, but -runby- is much faster and, in my opinion, simpler to use.

    2. The simplest way to determine whether 40 observations are available is to actually run -newey- and see how many it says it had. So that's the approach I used. -newey- seems to run very quickly on a set of up to 52 observations, so I don't think this will create much inefficiency compared to explicitly counting observations with missing values.

    3. The data that -newey- is applied to in this approach is not the original data set, but is a chunk of a clone of the original data set. That chunk does not carry with it the information about how the data are -tsset- or -xtset-, so the -tsset- needs to be done inside the program my_newey. (I guess this means that it isn't really a full "clone" of the original.)

    Leave a comment:


  • Johanna Moser
    replied
    Hi Clyde,
    first of all, thank you for your answer!!!!


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int company_id float(week ex_return) double(mkt_rf smb hml twexb_change_percentage)
    6 1820  -1.922182   .17 -1.21  1.85                  .
    6 1821  4.0626664  1.07   -.3   .13 -.0319025921651096
    6 1822   3.007111  -.22   .47  -.96 -.5542856415845588
    6 1823 -1.3971034   .79 -1.21   .09  .2684526551012281
    6 1824 -3.1557686  1.76 -1.23   .11  .0166934967665756
    6 1825   1.702802   .83  1.09  -.77  -.398238226404705
    6 1826      -.099  -.03  -.37   .66  .2480528809463563
    6 1827      -.099   .95  -.66   .87 -.4569744501027921
    6 1828  1.6549115  -.33   .75 -1.05 -.2111390276694827
    6 1829  1.6241304   .47  -.84  -.74 -.4712981869436382
    6 1830      -.115   .95  -.44  -.56 -.1046784368881651
    6 1831  1.5944017    .9  -.54  -.23  .2399777486939216
    6 1832  2.4060085   .09   .54  1.36 -.5425828016846495
    6 1833  3.1676884   .71 -1.18  1.41  -.874923088734785
    6 1834  2.2699523   .73   .69  -.24 -1.114025020836144
    6 1835  -5.537357  -.49  -.38  1.53 -.9998190772054948
    6 1836   2.757852  1.15   .25  -.53 -.1408511913849036
    6 1837 -2.1260319   .68 -1.17   .18 -.0387217579454937
    6 1838  2.3050244  1.28   -.1   .65 -.3285368222462772
    6 1839      -.134 -1.06  1.35   .26  1.493984971669278
    6 1840  -4.895905   .68  -.58    .4  .4127188016588153
    6 1841     -3.868  1.36 -1.24   .72 -1.093935111206834
    6 1842 -1.8496017  -.58  2.14 -1.42   .458236478468209
    6 1843  2.0846431  2.16  -.13 -1.32 -.0730010972286157
    6 1844   6.778552  1.78  -.75   -.3 -.0450502304535142
    6 1845  -7.376065  -.83  1.13    .2 -.0694329860945689
    6 1846   2.495696  2.21  -.67   .37 -.1279917376069615
    6 1847  -2.655373   .84  1.45 -1.13   .259751211377123
    6 1848  -.9825652 -1.41   .28   .02  .0174858453188828
    6 1849  -.9901929  2.14   .64  -.96  .0220194612202905
    6 1850  -2.770867   -.8    .5  1.36  .1129493497945198
    6 1851  -5.570546  -.32   .99  -.39  .6373720117661943
    6 1852  -4.442923  1.16   .68   -.5  1.553472254247672
    6 1853      -.116     0     0   .68  1.090733736844261
    6 1854   5.922151   .64  -.44  2.01 -.0910189631752859
    6 1855  2.2616682  1.91   .61 -1.06 -.0330792971773431
    6 1856  -7.052444  1.31 -1.33   .14  .5664963964832207
    6 1857  2.8770745  -.35   -.1 -1.71  .8852092428921295
    6 1858 -3.4896426   .15 -1.08  1.45 -1.404561685347355
    6 1859     -1.118  -.98  -2.2  1.41 -.0493520007365284
    6 1860   6.952707   .42  -.76   .63  .1843313036539588
    6 1861   -24.6463   .34    .1 -1.41  .0902320588945803
    6 1862     -6.368 -1.66  -1.2  -.52 -.0712446993730011
    6 1863   6.561666  2.23   .17 -1.08  .6997006016826614
    6 1864      4.895   .25   .35  -.16  .6612963318752312
    6 1865   5.252143   .71 -1.03   .45  .1477350975779138
    6 1866  -5.189746   -.3   -.8  1.58 -.1752095354420207
    6 1867  -7.264857  1.46   .67  -.58  .1490418287527331
    6 1868   6.288256  1.14  -.78  -.38  .3260769967213287
    6 1869 -1.3268193   -.7   .16   .58  .4610708980489516
    6 1870  2.3170245  -.34   .75   .53  .0933166276018613
    6 1871   7.020857   .78   .29  -.18 -.2994205773346552
    6 1872  2.1152222  -.48  -.19   .59  .2020980500156001
    6 1873 -3.3678696 -2.32   .12   .79  .3911540455130725
    6 1874  1.5783932  1.28 -1.63  -.12   .153957456110667
    6 1875   .4454862  1.67   .16 -1.02  .4569010137881136
    6 1876  -.3727253  2.27  -.41  -.82  .4881379781279293
    6 1877   5.755884   2.7 -1.78  -.71  -.385689339900815
    6 1878  -2.244938  -.82   1.1   .59  .0982943704509014
    6 1879 -2.2256596  1.61  -.41 -1.33 -.2600022923156712
    6 1880 -1.1849566 -2.02  1.61  1.58 -.3573739433812891
    6 1881 -2.2958024 -1.55   .11  -.26  .3787079875696428
    6 1882  -7.963169  1.42   .62 -1.04  .0257729824619537
    6 1883   5.999561  1.14  -.01   .67 -.1300734599436812
    6 1884  -2.971563  -.52  1.17    .9  .1297248010610148
    6 1885   5.211444   1.4  -.08  -.33 -.0456346003402419
    6 1886 -3.4847865  -2.7  2.03  -.53  .1548764870368509
    6 1887  -6.509349  1.49  1.18 -1.41  .2939756694820437
    6 1888  10.445005  1.79  1.32 -1.52 -.0497799068507639
    6 1889      -.106 -1.38  2.01  -.04  .0653751129112623
    6 1890 -3.1958876  1.37  -.29  -.22 -.0057706853306983
    6 1891   8.879507  2.62   .23  -.95  .1038783304443812
    6 1892  10.000383  1.12   .41   -.1   .337460133789865
    6 1893      -.106 -1.27   .86   .07  .3939867768188034
    6 1894   .3830918   .17 -1.35  -.16 -.2448665906312937
    6 1895 -1.0615385 -1.05   .16   .96  .3271194636593105
    6 1896  -3.012621  -.86 -2.46   .57 -.1892189129051771
    6 1897       -2.1   .58   .03   .21  .1150970498324158
    6 1898  3.9696326 -1.98   .47    .8 -.0010219087006377
    6 1899  -4.033569 -2.62 -2.91  2.93  .3228242574991119
    6 1900  -6.234449 -1.07   .41  -.19 -.2889855456295752
    6 1901   .4314783  -.87 -1.91   .86 -.5526763941798342
    6 1902  -4.967865  3.88 -1.28  -.91 -.0426313069936908
    6 1903 -1.2393637  -.06   .77  -.01 -.1184936406014964
    6 1904  -5.850127   .61  -.33   .78  .0928083795060269
    6 1905  1.1165122    .3   .42   .19  .0793587973647026
    6 1906   4.716277 -1.54  2.47  -.14  -.157461874813189
    6 1907 -2.4078505   .28     0   .17  .1498915678020208
    6 1908 -1.2854706   3.4  -1.5 -1.43  .3435052948506073
    6 1909 -4.2756667   .86   -.2 -1.11  .2215313302014492
    6 1910  12.934478   .19   .79  -.91 -.0955055204233619
    6 1911   8.685209  1.88 -1.42   .77  .2315794639625562
    6 1912 -13.237313  -.33  -.02    .6 -.0087725472314976
    6 1913  2.2195814   .94 -1.31   .77  .2505496074960811
    6 1914  -5.219636 -1.66  -.28  2.11  .1090870977031732
    6 1915  1.0956048   .04 -1.66  1.08 -.0147391725530304
    6 1916  -1.877148  3.32 -2.04  -.89 -.1623581272594788
    6 1917   15.56065   .78  -.48   .48 -.5026317906461209
    6 1918    -1.6645  1.12 -1.01  1.47 -.0740971962808138
    6 1919   .4271005  1.06   .01  -.02  .2193838202896143
    end
    format %tw week

    I hope this gives a clearer idea of what my data looks like. In total, i have 2137 companys with weekly obeservations for the time period 1995-2015.
    What I have to do is a Newey-Regression which creates yearly Beta per Company. It should NOT be rolling, hence i want to do a separate single regression for each company's data for each calender year.

    Additionally companies need to have at least 40 non-missing-values in the respective year, otherwise no regression should be run in the respective year.
    Thank you so much!
    Last edited by Johanna Moser; 14 Dec 2017, 02:44.

    Leave a comment:


  • Clyde Schechter
    replied
    I don't understand what you are trying to do. I assume you have multiple observations per company per year, and several years of data for many companies. Is that right? Is your data daily, monthly, or what? These questions could be answered by posting a short representative example of your data using the -dataex- command. If you are using Stata version 15.1, -dataex- is part of your official installation. If you have an earlier version, run -ssc install dataex- to get the command. Then run -help dataex- to read the simple instructions for using it.

    Most unclear is whether you want to do a rolling regression, in which each observation receives results of a regression for the observations obtained in the preceding years, or whether you simply want to do a separate single regression for each company's data for each calendar year.

    Leave a comment:


  • Johanna Moser
    replied
    Hi,
    I have a similar Problem to the one discussed in this thread, yet I am not supposed to use the rolling command.
    My data is long, with 2137 companys and the respective Company_id from 1 to 2137 over a time period from 1995 to 2015 (weekyl data).
    I do have to calculate a Regression which Looks like this:

    Weekly-Excess-Return= b1*Marketfactor+b2*Hml+b3*Smb+b4*Exchange_Rate

    (newey ex_return mkt_rf hml smb twexb)

    Now I fail to compute a Loop, which calculates one yearly Beta per Company.
    I also do have to exclude those companys, which do have more than 12 missing values per year

    I would be more than happy if anyone of you could help me!

    Leave a comment:


  • Robert Picard
    replied
    rangestat has been upgraded since #11 was posted and now includes a (reg) statistic. If I understand what you are asking and shooting blind since I cannot see your data or the code you tried, the following will calculate a regression using a 12 month window only for observations that are in June.

    Code:
    * create panel of monthly stocks returns over 30 years
    clear
    set seed 312312
    set obs 1000
    gen long stockid = _n
    expand 360
    bysort stockid: gen mdate = ym(1986,12) + _n
    format %tm mdate
    gen ret = runiform()
    gen mret = runiform()
    
    * define a valid upper interval bound only when the month is june
    gen high = cond(month(dofm(mdate)) == 6, mdate, -999)
    
    * regress over a 12 month window
    rangestat (reg) ret mret, by(stockid) interval(mdate -11 high)
    
    * spot check results
    list if stockid == 2 & mdate == ym(2001,6)
    reg ret mret if stockid == 2 & inrange(mdate, ym(2000,7), ym(2001,6))
    and the results for the spot check:
    Code:
    . list if stockid == 2 & mdate == ym(2001,6)
    
            +---------------------------------------------------------------------------------+
       534. | stockid |  mdate |      ret |     mret | high | reg_nobs |   reg_r2 | reg_adj~2 |
            |       2 | 2001m6 | .7165378 | .6462917 |  497 |       12 | .1522346 | .06745806 |
            |---------------------------------------------------------------------------------|
            |        b_mret     |        b_cons     |       se_mret      |       se_cons      |
            |     .31540325     |     .37679786     |     .23536795      |     .16262095      |
            +---------------------------------------------------------------------------------+
    
    . reg ret mret if stockid == 2 & inrange(mdate, ym(2000,7), ym(2001,6))
    
          Source |       SS           df       MS      Number of obs   =        12
    -------------+----------------------------------   F(1, 10)        =      1.80
           Model |  .097820681         1  .097820681   Prob > F        =    0.2099
        Residual |  .544744687        10  .054474469   R-squared       =    0.1522
    -------------+----------------------------------   Adj R-squared   =    0.0675
           Total |  .642565368        11  .058415033   Root MSE        =     .2334
    
    ------------------------------------------------------------------------------
             ret |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            mret |   .3154033    .235368     1.34   0.210    -.2090292    .8398357
           _cons |   .3767979   .1626209     2.32   0.043     .0144558    .7391399
    ------------------------------------------------------------------------------

    Leave a comment:


  • Arjan Trinks
    replied
    Hi there,

    I was wondering whether it would be possible to specify a 'stepsize' in the Rangestat command, similar to that in Rolling.
    My method is identical to that in this post (#1): I have a panel of monthly returns for a large number of stocks, and need to estimate the market beta of each stock (slope of regression of stock's excess return on the market excess return).
    Currently, I've adopted the Rangestat code in #11 and it works well.

    However, the code takes very long to run due to the large panel as well as the fact that I'm running the code several times for robustness analyses (e.g., using different window sizes/intervals).
    The thing is that I only need Rangestat to calculate the beta for each stock in June (month==6).
    With a stepsize=12 option, the market betas would be calculated for June in year1 and then move on to June in year2, etc.

    Is there a way to do so? Any help is appreciated!

    Leave a comment:


  • Michalis Samarinas
    replied
    Thank you for all your advise Professor. I will see those from the start.
    Regarding the code I used verbose to check whether it works or not. You newer version is correct but still it saves (at least in my original dataset) only the first year for the first company.
    Probably there is something that I do not do right.

    Due to that i went towards a different direction to acquire results. Given that i use exret as my dependent variable,
    I generated a variable to populate in order to restrict the results and asking for at least 11 obs per regression.
    Code:
    *create a variable measuring obs
    egen n_obs = count(exret), by (gvkey fyear)
    Then I used the statsby command to acquire the beta. I believe that it probably works
    Code:
    statsby _b, basepop(n_obs>10) by(gvkey fyear) ///
    saving(C:\beta results.dta, replace) : reg exret rmrf
    Thank you very much for all your valuable input and advise.

    Leave a comment:


  • Clyde Schechter
    replied
    Well, first of all, you have introduced some new variable monyear2 that was not in your original data and tried to -xtset- the data with it. Let me point out that to do this with -runby-, there is no need to -xtset- the data at all, least of all with a non-existent variable.

    Removing that, there is another problem, and it is more severe: you are also regressing a non-existent variable exret inside program yearly_beta. So none of the regressions actually run, and, so, there is nothing left in your data at the end:

    Code:
    . * Example generated by -dataex-. To install: ssc install dataex
    . clear
    
    . input long gvkey float(date fyear return rf rmrf)
    
                gvkey       date      fyear     return         rf       rmrf
      1. 1045 14610 2000   -.21918684 .41  -4.74
      2. 1045 14641 2000  -.017575145 .43   2.45
      3. 1045 14670 2000   -.50610864 .47    5.2
      4. 1045 14701 2000    .06637507 .46   -6.4
      5. 1045 14731 2000     -.178293  .5  -4.42
      6. 1045 14762 2000   -.07512063  .4   4.64
      7. 1045 14792 2000    .22361626 .48  -2.51
      8. 1045 14823 2000  -.007590169  .5   7.03
      9. 1045 14854 2000 -.0038167986 .51  -5.45
     10. 1045 14884 2000  .0019102202 .56  -2.76
     11. 1045 14915 2000   .020775063 .51 -10.72
     12. 1045 14945 2000     .1586798  .5   1.19
     13. 1045 14976 2001  -.002491135 .54   3.13
     14. 1045 15007 2001    -.1618119 .38 -10.05
     15. 1045 15035 2001    .05471597 .42  -7.26
     16. 1045 15066 2001      .081706 .39   7.94
     17. 1045 15096 2001   .022828516 .32    .72
     18. 1045 15127 2001   -.07618167 .28  -1.94
     19. 1045 15157 2001  -.027498914  .3  -2.13
     20. 1045 15188 2001   -.09420131 .31  -6.46
     21. 1045 15219 2001   -.51364297 .28  -9.25
     22. 1045 15249 2001   -.05035872 .22   2.46
     23. 1045 15280 2001     .1600984 .17   7.54
     24. 1045 15310 2001     .0430666 .15   1.61
     25. 1050 14610 2000   -.10536052 .41  -4.74
     26. 1050 14641 2000     .3285041 .43   2.45
     27. 1050 14670 2000   -.22314355 .47    5.2
     28. 1050 14701 2000    .04879016 .46   -6.4
     29. 1050 14731 2000   -.18232156  .5  -4.42
     30. 1050 14762 2000   .014184635  .4   4.64
     31. 1050 14792 2000  -.014184635 .48  -2.51
     32. 1050 14823 2000   .028170876  .5   7.03
     33. 1050 14854 2000   -.05715841 .51  -5.45
     34. 1050 14884 2000     -.194156 .56  -2.76
     35. 1050 14915 2000  -.074107975 .51 -10.72
     36. 1050 14945 2000    -.1670541  .5   1.19
     37. 1050 14976 2001     .3429447 .54   3.13
     38. 1050 15007 2001  -.032789823 .38 -10.05
     39. 1050 15035 2001   -.06899287 .42  -7.26
     40. 1050 15066 2001    .13353139 .39   7.94
     41. 1050 15096 2001    .04879012 .32    .72
     42. 1050 15127 2001   -.04879012 .28  -1.94
     43. 1050 15157 2001            0  .3  -2.13
     44. 1050 15188 2001    .04879012 .31  -6.46
     45. 1050 15219 2001    .05556991 .28  -9.25
     46. 1050 15249 2001     .6375773 .22   2.46
     47. 1050 15280 2001    -.1766235 .17   7.54
     48. 1050 15310 2001   -.06453853 .15   1.61
     49. end
    
    . format %d date
    
    . format %ty fyear
    
    . 
    . capture program drop yearly_beta
    
    . program define yearly_beta
      1.     regress exret rmrf
      2.     foreach v of varlist rf rmrf {
      3.         gen fy_b_`v' = _b[`v']
      4.         gen fy_se_`v' = _se[`v']
      5.     }
      6.     gen fy_reg_nobs = e(N)
      7.     gen fy_reg_r2 = e(r2)
      8.     gen fy_reg_adj_r2 = e(r2_a)
      9.     exit
     10. end
    
    . 
    . runby yearly_beta, by(gvkey fyear) verbose
    variable exret not found
    variable exret not found
    variable exret not found
    variable exret not found
    
    --------------------------------------
    Number of by-groups    =             4
    by-groups with errors  =             4
    by-groups with no data =             0
    Observations processed =            48
    Observations saved     =             0
    --------------------------------------
    The variables you use inside the program yearly_beta have to be the actual variables in your data set. So if we change exret back to rf, the way it was in #25, there is still a problem, and this one is my error. It isn't possible to calculate b[rf] and se[rf] because rf is the dependent variable. So the code should be:

    Code:
    . capture program drop yearly_beta
    
    . program define yearly_beta
      1.     regress rf rmrf
      2.         gen fy_b_rmrf = _b[rmrf]
      3.         gen fy_se_rmrf = _se[rmrf]
      4.     gen fy_reg_nobs = e(N)
      5.     gen fy_reg_r2 = e(r2)
      6.     gen fy_reg_adj_r2 = e(r2_a)
      7.     exit
      8. end
    
    . 
    . runby yearly_beta, by(gvkey fyear) verbose
    
          Source |       SS           df       MS      Number of obs   =        12
    -------------+----------------------------------   F(1, 10)        =      0.81
           Model |  .001767413         1  .001767413   Prob > F        =    0.3897
        Residual |  .021857585        10  .002185758   R-squared       =    0.0748
    -------------+----------------------------------   Adj R-squared   =   -0.0177
           Total |  .023624998        11  .002147727   Root MSE        =    .04675
    
    ------------------------------------------------------------------------------
              rf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            rmrf |  -.0023347   .0025963    -0.90   0.390    -.0081197    .0034503
           _cons |   .4742918   .0139598    33.98   0.000     .4431874    .5053961
    ------------------------------------------------------------------------------
    
          Source |       SS           df       MS      Number of obs   =        12
    -------------+----------------------------------   F(1, 10)        =      0.31
           Model |  .003920557         1  .003920557   Prob > F        =    0.5915
        Residual |   .12754611        10  .012754611   R-squared       =    0.0298
    -------------+----------------------------------   Adj R-squared   =   -0.0672
           Total |  .131466668        11  .011951515   Root MSE        =    .11294
    
    ------------------------------------------------------------------------------
              rf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            rmrf |  -.0030828   .0055603    -0.55   0.591    -.0154719    .0093064
           _cons |   .3098164   .0332133     9.33   0.000     .2358126    .3838202
    ------------------------------------------------------------------------------
    
          Source |       SS           df       MS      Number of obs   =        12
    -------------+----------------------------------   F(1, 10)        =      0.81
           Model |  .001767413         1  .001767413   Prob > F        =    0.3897
        Residual |  .021857585        10  .002185758   R-squared       =    0.0748
    -------------+----------------------------------   Adj R-squared   =   -0.0177
           Total |  .023624998        11  .002147727   Root MSE        =    .04675
    
    ------------------------------------------------------------------------------
              rf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            rmrf |  -.0023347   .0025963    -0.90   0.390    -.0081197    .0034503
           _cons |   .4742918   .0139598    33.98   0.000     .4431874    .5053961
    ------------------------------------------------------------------------------
    
          Source |       SS           df       MS      Number of obs   =        12
    -------------+----------------------------------   F(1, 10)        =      0.31
           Model |  .003920557         1  .003920557   Prob > F        =    0.5915
        Residual |   .12754611        10  .012754611   R-squared       =    0.0298
    -------------+----------------------------------   Adj R-squared   =   -0.0672
           Total |  .131466668        11  .011951515   Root MSE        =    .11294
    
    ------------------------------------------------------------------------------
              rf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            rmrf |  -.0030828   .0055603    -0.55   0.591    -.0154719    .0093064
           _cons |   .3098164   .0332133     9.33   0.000     .2358126    .3838202
    ------------------------------------------------------------------------------
    
    --------------------------------------
    Number of by-groups    =             4
    by-groups with errors  =             0
    by-groups with no data =             0
    Observations processed =            48
    Observations saved     =            48
    --------------------------------------
    . list in 1/5, noobs clean
    
        gvkey        date   fyear      return    rf    rmrf   fy_b_rmrf   fy_se_~f   fy_reg~s   fy_~g_r2   fy_r~j_r2  
         1045   01jan2000    2000   -.2191868   .41   -4.74   -.0023347   .0025963         12   .0748111   -.0177078  
         1045   01feb2000    2000   -.0175751   .43    2.45   -.0023347   .0025963         12   .0748111   -.0177078  
         1045   01mar2000    2000   -.5061086   .47     5.2   -.0023347   .0025963         12   .0748111   -.0177078  
         1045   01apr2000    2000    .0663751   .46    -6.4   -.0023347   .0025963         12   .0748111   -.0177078  
         1045   01may2000    2000    -.178293    .5   -4.42   -.0023347   .0025963         12   .0748111   -.0177078  
    
    . list in -5/l, noobs clean
    
        gvkey        date   fyear      return    rf    rmrf   fy_b_rmrf   fy_se_~f   fy_reg~s   fy_~g_r2   fy_r~j_r2  
         1050   01aug2001    2001    .0487901   .31   -6.46   -.0030828   .0055603         12   .0298217   -.0671961  
         1050   01sep2001    2001    .0555699   .28   -9.25   -.0030828   .0055603         12   .0298217   -.0671961  
         1050   01oct2001    2001    .6375773   .22    2.46   -.0030828   .0055603         12   .0298217   -.0671961  
         1050   01nov2001    2001   -.1766235   .17    7.54   -.0030828   .0055603         12   .0298217   -.0671961  
         1050   01dec2001    2001   -.0645385   .15    1.61   -.0030828   .0055603         12   .0298217   -.0671961
    I take it you are playing around with different data sets. When you change the variable names in the data, you have to change them correspondingly in yearly_beta as well.

    By the way, if you have a lot of different gvkey/fyear combinations, you probably don't want all the regressions flying across your screen. So remove the -verbose- option from the -runby- command and you will just get the progress report and results summary, with the results you want left in the data set.

    To restrict your results to those gvkey/fyear combinations where there are at least 10 observations:

    Code:
    foreach v of varlist fy_b_rmrf fy_se_rmrf fy_*r2 {
        replace `v' = . if fy_n_obs < 10
    }

    This last is just basic data management in Stata. If you are not familiar with the use of -foreach-, replace-, and -if- then you need to step back from your project and take the time to learn the fundamentals of using Stata. Click on PDF Documentation in Stata's Help menu. Then follow the links for the Getting Started [GS] and User's Guide [U] volumes. Read them in their entirety. It's a lot of material, and you won't retain everything. But you will be exposed to the overall approach that Stata uses for data management and analysis, and the routine commands that are involved in using Stata. From there you will generally be able to figure out which commands are likely to play a role in solving your problems, and you can refer back to the help files for details that you don't remember.

    Leave a comment:

Working...
X