  • Regression by stocks and output variables and t-values

    Hi all,

    I am running a simple time-series regression to examine stock liquidity in the financial market. This is a panel dataset, and each stock has a unique ID number in the ID column. The dataset contains 120 stocks over the past 3 years. The column "year" holds the year variable.

    My regression is like: y= a + bx + c

    Here, I want to run the regression for each stock and extract the coefficient b and its corresponding t-value.
    In the end I should get 120 coefficients and t-values.


    My code is:
    Code:
     
    gen beta_x=.
    
    qui levelsof ID if  ID > 0 & ID < 121 , local(year)
    
    foreach v of local year{
    
    qui reg y x,r if ID == `v'
    
    qui replace beta_x= _b[_x] if ID == `v'
    }
    
    **Use collapse to summarize data**
    collapse beta_x, by(ID) 
    outreg2 using c_1, bdec(3) stats (coef tstat) excel replace dec(2)

    However, I can't get the desired results. Can someone help me? Thanks!

  • #2
    Well there is one clear problem in the code:

    Code:
    qui reg y x,r if ID == `v'
    should be

    Code:
    qui reg y x if ID == `v', r
    -if- qualifiers precede the comma that sets off options in Stata syntax.

    The other clear problem is:

    Code:
    qui replace beta_x= _b[_x] if ID == `v'
    should be
    Code:
    qui replace beta_x= _b[x] if ID == `v'
    The _b[] reference does begin with an underscore character, but you do not add an underscore before the name of the variable whose coefficient you are trying to access.

    With those changes, your code has a good chance of running. It will not, however, produce any t-statistics because you do nothing to create them. If you want those, the easiest way to get them is to save _se[x], and then calculate t as the quotient of beta and the standard error.

    Pitfalls may also arise if there are some IDs for which there are not enough observations to do the desired regression.
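
    Putting these points together, here is a minimal sketch of the corrected loop. It assumes the variable names y, x, and ID from the original post; se_x, t_x, and the local ids are names introduced here for illustration, and the simulated data at the top is only a stand-in so that the example runs on its own. The -capture- prefix is one way to keep the loop going when a regression cannot be estimated for some ID.

    ```stata
    * illustrative data only: 120 stocks with 36 observations each
    clear
    set seed 12345
    set obs 120
    gen ID = _n
    expand 36
    gen x = runiform()
    gen y = runiform()

    gen beta_x = .
    gen se_x   = .

    qui levelsof ID, local(ids)
    foreach v of local ids {
        * capture suppresses output and lets the loop continue on error
        capture reg y x if ID == `v', r
        if _rc == 0 {
            qui replace beta_x = _b[x]  if ID == `v'
            qui replace se_x   = _se[x] if ID == `v'
        }
    }

    * t-statistic is the coefficient divided by its standard error
    gen t_x = beta_x / se_x

    * reduce to one row per stock
    collapse (first) beta_x se_x t_x, by(ID)
    ```

    IDs for which the regression failed are left with missing values in beta_x, se_x, and t_x, so they are easy to spot afterwards.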

    In the future, when asking for help with code, please show a data example that goes with the code, and, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.






    • #3
      Clyde gives good advice as always. I find it easier to store results in a matrix which can then be exported to Excel or some other editor. Based on your example, here is one way

      Code:
      qui levelsof ID if  ID > 0 & ID < 121 , local(year)
      local n: word count `year'
      matrix R = J(`n',2,.)
      
      local i=1
      foreach v of local year{
      qui reg y x if ID == `v',r 
      matrix R[`i',1]= _b[x]
      matrix R[`i',2]= round(_b[x]/_se[x], 0.01)
      local ++i
      }
      
      putexcel set "results.xls", sheet("Coefficients and t-statistics") 
      putexcel A1=("Coefficient")
      putexcel B1=("t-statistic")
      putexcel A2=matrix(R)



      • #4
        Clyde Schechter Dear Mighty Clyde, Thank you so much. I will use -dataex- next time to share a data sample.

        I think your code works, but it gives me 'matrix e(b) not found; run/post a regression, or specify varlist for non-regression outputs'.

        I ran the normal regression first, y = a + bx + c, and then ran your code. However, it only gave me the regression results for the whole sample. I am new to Stata. Could you please advise? Thanks.



        • #5
          Andrew Musau Hi Andrew, Thanks. The code runs fine and 'results.xls' was generated. However, there is nothing in the xls file. I think there are some issues with my current Stata version. Six months ago I could open the xls results exported from Stata, but now I cannot open the xls file. Stata only gives me a warning message.



          • #6
            Can you post the commands and errors that you get?

            'matrix e(b) not found; run/post a regression, or specify varlist for non-regression outputs'.
            This implies that your regressions did not run. For diagnostic purposes, it may also be useful to include the following lines

            Code:
            qui levelsof ID if  ID > 0 & ID < 121 , local(year)
            local n: word count `year'
            di `n'
            matrix R = J(`n',2,.)
            local i=1
            foreach v of local year{
            qui reg y x if ID == `v',r
            matrix R[`i',1]= _b[x]
            matrix R[`i',2]= round(_b[x]/_se[x], 0.01)
            local ++i
            }
            mat list R
            putexcel set "results.xls", sheet("Coefficients and t-statistics")
            putexcel A1=("Coefficient")
            putexcel B1=("t-statistic")
            putexcel A2=matrix(R)
            So if matrix R has all elements, then the issue arises when you are exporting output to Excel. Otherwise, there may be some problems with your implementation of the regressions.



            • #7
              Andrew Musau Thank you very much. The code works perfectly !

              Can I ask you how to extract R square and adjusted R square from each regression ?
              I used the following

              Code:
              qui levelsof ID if ID > 0 & ID < 121 , local(year)
              local n: word count `year'
              di `n'
              matrix R = J(`n',4,.)
              local i=1
              foreach v of local year{
              qui reg y x if ID == `v',r
              matrix R[`i',1]= _b[x]
              matrix R[`i',2]= round(_b[x]/_se[x], 0.01)
              matrix R[`i',3]= e(r2)
              matrix R[`i',4]= e(r2_a)
              local ++i
              }
              
              mat list R
              However, the system says "r2 not found"

              Thank You Very much.



              • #8
                Your syntax looks fine to me. Try running

                Code:
                foreach v of local year{
                reg y x if ID == `v',r
                di e(r2)
                di e(r2_a)
                }
                and see if you can spot the problem, i.e., if you still get the error.
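
                It may also help to check what a single regression leaves behind before looping. The sketch below uses the auto dataset that ships with Stata purely as a stand-in (it is not your data, and price and mpg are just convenient variables); -ereturn list- shows every stored result, so you can confirm that e(r2) and e(r2_a) are there.

                ```stata
                * stand-in data shipped with Stata, used only for illustration
                sysuse auto, clear

                * same form of regression as in the loop, with robust standard errors
                reg price mpg, r

                * show everything the regression stored in e()
                ereturn list

                * the two results of interest
                di e(r2)
                di e(r2_a)
                ```

                If both values display here but not inside your loop, the problem is more likely in how the loop is being run than in the e() names themselves.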



                • #9
                  Andrew Musau Thank you very much Andrew.



                  • #10
                    You can also use runby (from SSC) to run any number of commands on data subsets defined using by-groups. For each distinct value of ID, runby will run the myreg program defined below. Before running myreg, runby replaces the data in memory with the subset of observations defined using the current distinct value of ID. This is more efficient as there is no need to use the if qualifier to restrict commands to a particular subset and makes for simpler code. With runby, what's left in memory accumulates and when runby finishes processing all by-groups, the data in memory is replaced with the accumulated results.

                    Code:
                    * create a demonstration dataset with 120 stocks, each with 36 months of data
                    clear all
                    set seed 321
                    set obs 120
                    gen ID = _n
                    expand 36
                    bysort ID: gen time = _n
                    gen y = runiform()
                    gen x = runiform()
                    
                    * do a single case that we'll spot check later on
                    reg y x if ID == 21, robust
                    
                    program myreg
                        capture noisily reg y x, robust
                        keep in 1
                        keep ID
                        gen nobs = e(N)
                        gen b_x  = _b[x]
                        gen t_x  = _b[x] / _se[x]
                        gen r2   = e(r2)
                        gen r2a  = e(r2_a)
                    end
                    runby myreg, by(ID)
                    
                    * spot check one case
                    list if ID == 21
                    The myreg program runs the desired regression and then reduces the data to a single observation with a single variable (ID). The program then creates the desired variables using the estimation results. Here's the output:
                    Code:
                    . * do a single case that we'll spot check later on
                    . reg y x if ID == 21, robust
                    
                    Linear regression                               Number of obs     =         36
                                                                    F(1, 34)          =       0.08
                                                                    Prob > F          =     0.7835
                                                                    R-squared         =     0.0020
                                                                    Root MSE          =     .30277
                    
                    ------------------------------------------------------------------------------
                                 |               Robust
                               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                               x |  -.0509101   .1838154    -0.28   0.783    -.4244679    .3226478
                           _cons |   .5201195    .095969     5.42   0.000     .3250871     .715152
                    ------------------------------------------------------------------------------
                    
                    . 
                    . program myreg
                      1.         capture noisily reg y x, robust
                      2.         keep in 1
                      3.         keep ID
                      4.         gen nobs = e(N)
                      5.         gen b_x  = _b[x]
                      6.         gen t_x  = _b[x] / _se[x]
                      7.         gen r2   = e(r2)
                      8.         gen r2a  = e(r2_a)
                      9. end
                    
                    . runby myreg, by(ID)
                    
                    --------------------------------------
                    Number of by-groups    =           120
                    by-groups with errors  =             0
                    by-groups with no data =             0
                    Observations processed =         4,320
                    Observations saved     =           120
                    --------------------------------------
                    
                    . 
                    . * spot check one case
                    . list if ID == 21
                    
                         +----------------------------------------------------------+
                         | ID   nobs         b_x         t_x         r2         r2a |
                         |----------------------------------------------------------|
                     21. | 21     36   -.0509101   -.2769631   .0020239   -.0273283 |
                         +----------------------------------------------------------+
