
  • Performing Regressions on Panel and Time Series Data

    Good afternoon, first time poster here. I'm an undergraduate student trying to run some regressions on data I have been compiling. I thought this method of compiling would make it simple to regress, but it's proving to be more complicated than I thought.

    I am in the process of trying to use daily market data to calculate the ANNUAL Betas by Firm then calculate the error term for each firm_year combination. Ultimately, I'm trying to use daily stock return data (regressed on/controlled by the Fama-French 3-factors**) to calculate the annual systemic (Error Term) and systematic (Beta) risk for each firm.

    To do this, I have ~252 daily data points per year over 11 years (2011 - 2021) for 100 firms (277,200 data points in total). In the end, I hope to have one data point for every firm-year combination (100 firms * 11 years = 1,100 data points).

    I have been able to run the regression using "sort" and "by" to perform one regression per year-firm combination and receive the appropriate annual beta (displayed below). However, calculating the error term for each of these points is proving to be much more complicated. It seems as if using the "predict y_hat" and "predict residuals" doesn't break the results into each individual firm_year subset.

    Finally, if anybody has any advice on exporting the results to an Excel or CSV file that contains the relevant Firm, Year, Systemic Risk, and Systematic Risk, that would be greatly appreciated (currently I'm planning on combining the data points manually).

    Below I have listed an example of my variables, as well as the code I'm currently using:



    [Screenshot: example of the variables]



    Code:

    [Screenshot: the code currently in use]


    At this point it should give me the squared difference, (predicted - actual)^2, for each daily observation. If I'm not mistaken, I would then need to sum that variable within each firm-year to get the error term (Annual Systemic Risk) for that firm.
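    For reference, a sum of squared residuals and a residual standard deviation are related but not the same quantity. A minimal sketch of the standard OLS relationship, assuming k = 4 estimated coefficients (three factors plus the constant). The notation is introduced here for illustration and is not from the post: i is the firm, t the year, d the trading day, and n_{i,t} the number of daily observations in that firm-year.

```latex
\[
\mathrm{SSR}_{i,t} = \sum_{d=1}^{n_{i,t}} \bigl(r_{i,t,d} - \hat{r}_{i,t,d}\bigr)^{2},
\qquad
\hat{\sigma}_{i,t} = \sqrt{\frac{\mathrm{SSR}_{i,t}}{n_{i,t} - k}}
\]
```

    The second quantity is what regress reports as Root MSE. With roughly 252 observations per firm-year, dividing by n - k or by n - 1 (as summarize does for a standard deviation) should give very similar numbers.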

    I hope I was clear enough in explaining my goals, and explain what I've been doing so far. I really appreciate your time and any support you may have to offer.

    Best!

  • #2
    All,

    Sorry for responding to my own post, but I simplified things a lot, and I'm hoping that someone may have the key to automating this process (so far, my attempts to create a loop have failed).

    Note: I renamed my variables from the first post (Num is now compid and Year is now year; otherwise I just made all variable names lowercase).

    Ultimately, I used the "sort" and "by" commands to get the "mktrf beta" for each company-year combination:

    Code:
    sort compid year
    by compid year: regress assetreturn mktrf smb hml,r
    That worked like a charm, but I can't use the "predict" command with the "by" command, so I went with a different method to try and isolate the residuals.

    Now, my primary goal is to create residuals that are specific to each company-year combination and then summarize the standard deviation of that year. So far, if I use the code below, I can get the residual standard deviation for each company-year combination. However, in this method, I'll have to update each company and rerun the regression 100 times (for all 100 companies).

    Does anybody have any advice on automating this process (i.e., once it has run all of these for compid == 2, it automatically moves on to 3, 4, 5, etc.)?
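    One possible way to automate this is a nested loop over the distinct values of compid and year. The following is only a sketch under some assumptions: compid and year are numeric, and resid and resid_sd are hypothetical new variable names that do not exist in the data yet.

```stata
* Sketch: run one regression per compid-year and keep the residuals in one variable.
* resid and resid_sd are hypothetical names chosen for this example.
generate double resid = .
levelsof compid, local(firms)
levelsof year, local(years)

foreach f of local firms {
    foreach y of local years {
        capture quietly regress assetreturn mktrf smb hml if compid == `f' & year == `y'
        if _rc continue   // skip pairs with no (or too few) observations
        tempvar r
        quietly predict double `r' if e(sample), residuals
        quietly replace resid = `r' if e(sample)
        drop `r'
    }
}

* Residual standard deviation per firm-year, repeated on every daily row
bysort compid year: egen double resid_sd = sd(resid)
```

    If one row per firm-year is preferred over a value repeated on every daily row, something like collapse (sd) resid_sd = resid, by(compid year) in place of the last line should reduce the data to that shape, at the cost of replacing the daily data in memory.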

    Additionally, can I use the matrix command to store the std_dev results and export them to an Excel file? My thought is to use outreg.
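    An alternative to matrices that may be simpler here is statsby, which runs a command once per group and saves the requested results as a dataset. A minimal sketch, assuming the daily panel is in memory; the file names betas_by_firmyear and firm_year_risk and the variable names rmse, nobs, and beta_mktrf are made up for this example.

```stata
* Sketch: one regression per compid-year, saving coefficients, Root MSE, and N.
* All file names and new variable names below are hypothetical.
statsby _b rmse=e(rmse) nobs=e(N), by(compid year) saving(betas_by_firmyear, replace): ///
    regress assetreturn mktrf smb hml

* The saved dataset has one row per compid-year
use betas_by_firmyear, clear
rename _b_mktrf beta_mktrf

* Write the firm-year results to Excel and to CSV
export excel using "firm_year_risk.xlsx", firstrow(variables) replace
export delimited using "firm_year_risk.csv", replace
```

    Note that e(rmse) divides the sum of squared residuals by n - k, so it will differ slightly from the standard deviation that summarize reports for the residuals.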

    Code:
    
    regress assetreturn mktrf smb hml if compid == 2 & year == 2011   // regression for compid 2, year 2011
    predict residuals if e(sample), residuals   // residuals for this compid-year only
    summarize residuals   // the reported Std. dev. is the residual standard deviation
    drop residuals   // drop so that the next regression's residuals are isolated

    regress assetreturn mktrf smb hml if compid == 3 & year == 2012
    predict residuals if e(sample), residuals
    summarize residuals
    drop residuals

    regress assetreturn mktrf smb hml if compid == 3 & year == 2013
    predict residuals if e(sample), residuals
    summarize residuals
    drop residuals

    regress assetreturn mktrf smb hml if compid == 3 & year == 2014
    predict residuals if e(sample), residuals
    summarize residuals
    drop residuals

    regress assetreturn mktrf smb hml if compid == 3 & year == 2015
    predict residuals if e(sample), residuals
    summarize residuals
    drop residuals

    regress assetreturn mktrf smb hml if compid == 3 & year == 2016
    predict residuals if e(sample), residuals
    summarize residuals
    drop residuals

    regress assetreturn mktrf smb hml if compid == 3 & year == 2017
    predict residuals if e(sample), residuals
    summarize residuals
    drop residuals

    regress assetreturn mktrf smb hml if compid == 3 & year == 2018
    predict residuals if e(sample), residuals
    summarize residuals
    drop residuals

    regress assetreturn mktrf smb hml if compid == 3 & year == 2019
    predict residuals if e(sample), residuals
    summarize residuals
    drop residuals

    regress assetreturn mktrf smb hml if compid == 3 & year == 2020
    predict residuals if e(sample), residuals
    summarize residuals
    drop residuals

    regress assetreturn mktrf smb hml if compid == 3 & year == 2021
    predict residuals if e(sample), residuals
    summarize residuals
    drop residuals

    Thank you all in advance!
