
  • store the R² values for each regression

    Hello Statalist,

    I want to run a series of daily regressions for each company, so I loop over companies. The regression is:

    daily residual = a + b1 aggregateresidual + b2 lag aggregateresidual + b3 lead aggregateresidual + e


    I want to store the R² value from each of these regressions.
    I used the code below, but I get only missing values for R² for every company. What I actually need is a separate R² value for each firm.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str6 company float(bcaldate liquidity residual9 idc aggregateresidual)
    "AG:ALP" 503    -.0060211     .04673871 1 1.791005e-11
    "AG:ALP" 506  -.004895052             0 1 1.791005e-11
    "AG:ALP" 507  -.002620364             0 1 1.791005e-11
    "AG:ALP" 508   -.15307453    -.10031472 1 1.791005e-11
    "AG:ALP" 509   -.02777796   -.001915913 1 1.791005e-11
    "AG:ALP" 510  -.021299465   -.005525792 1 1.791005e-11
    "AG:ALP" 513   -.04469265    .008067165 1 1.791005e-11
    "AG:ALP" 514  -.027108947  -.0012468975 1 1.791005e-11
    "AG:ALP" 515   -.02216766   -.006393986 1 1.791005e-11
    "AG:ALP" 518  -.007250976     .04550884 1 1.791005e-11
    "AG:ALP" 519  -.022699237   .0031628124 1 1.791005e-11
    "AG:ALP" 520  -.003853896    .011919777 1 1.791005e-11
    "AG:ALU"   2 -.0041765217    .013984367 2 1.791005e-11
    "AG:ALU"   3  -.014315836 -.00006125495 2 1.791005e-11
    "AG:ALU"   4    -.0210848   -.012289435 2 1.791005e-11
    "AG:ALU"   5  -.008408533   -.002418835 2 1.791005e-11
    "AG:ALU"   6  -.003088765  -.0003082317 2 1.791005e-11
    "AG:ALU"   7   -.03531823    -.01715734 2 1.791005e-11
    "AG:ALU"   8   -.01170215    .002552432 2 1.791005e-11
    "AG:ALU"   9 -.0021240253    .006671338 2 1.791005e-11
    "AG:ALU"  10 -.0015961546    .004393544 2 1.791005e-11
    "AG:ALU"  11 -.0024723015   .0003082317 2 1.791005e-11
    "AG:ALU"  12  -.014987913    .003172976 2 1.791005e-11
    "AG:ALU"  13  -.016745757   -.002491176 2 1.791005e-11
    "AG:ALU"  14  -.003177268    .005618096 2 1.791005e-11
    "AG:ALU"  15  -.007964408  -.0019747093 2 1.791005e-11
    end
    format %tbmybcal bcaldate
    Code:
    egen group = group(company)
    gen r2 = .
    su group, meanonly
    
    forval g = 1/`r(max)' {
        capture reg residual9 aggregateresidual_L L.aggregateresidual_L F.aggregateresidual_L if group == `g'
        replace r2 = e(r2_a) if group == `g'  
    }
    I would really appreciate it if someone could help me store the R² value for each company.
    Thank you in advance.

  • #2
    You are not getting the R2's because the regressions themselves are not running. If you change -capture- to -capture noisily- you will see that they are not running because the variable aggregateresidual_L referred to in the command does not exist in the data. If you fix that, you will then find that the regressions still do not run, because you are using the L. and F. operators without having -xtset- or -tsset- your data.
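
    As a minimal sketch of how -capture noisily- exposes a trapped error, the lines below use Stata's shipped auto dataset rather than the data in this thread; the variable name no_such_variable is hypothetical and deliberately absent from that dataset.

    ```stata
    * Sketch only: auto.dta and no_such_variable are illustrative assumptions
    sysuse auto, clear

    * -capture- alone would swallow the message; adding -noisily- still traps
    * the error but prints the usual output, so the cause is visible
    capture noisily regress price no_such_variable

    * _rc holds the return code of the captured command
    display "return code = " _rc    // 111: variable not found
    ```

    A nonzero _rc after -capture- means the command did not run, so any e() results read afterwards do not come from that command.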

    After you fix that, you will find that your regressions run, but your R2 is always zero. This is because, at least in your example data, the "variable" aggregateresidual is actually a constant. So aggregateresidual and its lead and lag are collinear with the constant term and all get dropped!
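
    One way to check this before running any regressions is to compare the minimum and maximum of the regressor within each company. This is a rough sketch that assumes the example data from #1 is in memory; the names agg_min, agg_max and agg_constant are hypothetical helper variables.

    ```stata
    * Sketch only: agg_min, agg_max and agg_constant are hypothetical names
    bysort company: egen double agg_min = min(aggregateresidual)
    bysort company: egen double agg_max = max(aggregateresidual)

    * 1 if the regressor takes a single value within the company, 0 otherwise
    generate byte agg_constant = (agg_min == agg_max)

    * Companies flagged 1 have no variation for the regression to use
    tabulate company agg_constant
    ```

    Comparing the minimum with the maximum avoids relying on a standard deviation being exactly zero, which floating-point rounding does not guarantee.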

    I think this is a terrific example of why the indiscriminate use of -capture- is bad programming practice. The purpose of -capture- is to allow you to continue execution of a program in the event of expected, predictable conditions that preclude correct execution of a particular command or block of commands. In the context shown here, typically this is done because there may, for some groups, be insufficient observations to carry out the regression. The problem is that -capture- doesn't discriminate between that normal, expected problem and other errors. In this case you have made several other errors, and -capture- just blunders on through them without giving you any idea that things have gone badly wrong. At the end, you are left with no results and no clue why!

    So, in addition to making the fixes I've outlined already, you should rewrite the code so that after -capture- you check to make sure that the exceptional condition is the one you were expecting, and then grind to a halt if it isn't.

    Code:
    egen group = group(company)
    xtset group bcaldate
    gen r2 = .
    su group, meanonly
    
    forval g = 1/`r(max)' {
        capture reg residual9 aggregateresidual L.aggregateresidual F.aggregateresidual if group == `g'
        if !inlist(c(rc), 0, 2000, 2001) { // ERROR OTHER THAN no/insufficient observations
            display as error "Unexpected error encountered, group = `g'"
            error c(rc)
        }
        else if c(rc) == 0 {
            replace r2 = e(r2_a) if group == `g'
        }
    }
    General Principle: Code should be written not just to produce correct results when its assumptions are met, but to fail gracefully when they aren't and, if possible, to explain what went wrong.
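
    Note that e(r2_a), used above, is the adjusted R-squared that -regress- stores; the unadjusted R-squared is in e(r2). As a hedged sketch of keeping both, the loop below follows the same pattern on Stata's shipped auto dataset, with foreign standing in for the company identifier; the regression and the names r2 and r2_adj are illustrative assumptions, not part of the original data.

    ```stata
    * Sketch only: auto.dta, the regressors and r2_adj are illustrative assumptions
    sysuse auto, clear
    generate r2 = .
    generate r2_adj = .

    levelsof foreign, local(groups)
    foreach g of local groups {
        capture regress price mpg weight if foreign == `g'
        if !inlist(_rc, 0, 2000, 2001) {    // anything but no/insufficient observations
            display as error "Unexpected error encountered, group = `g'"
            error _rc
        }
        else if _rc == 0 {
            replace r2     = e(r2)   if foreign == `g'    // unadjusted R-squared
            replace r2_adj = e(r2_a) if foreign == `g'    // adjusted R-squared
        }
    }
    ```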
    Last edited by Clyde Schechter; 23 Mar 2017, 10:41.



    • #3
      Many thanks Clyde for the explanation.
      Perhaps the sample data itself (I mean the aggregate residual) is not correct.
      I will correct it and redo the analysis.
      Thank you for making it clear.
