
  • Internal validation/Correction for optimism in linear regression using Harrell's method

    Hi:

    I'm in the process of internally validating a linear regression model using the method described by Harrell. Basically:

    1) Generate a random bootstrap sample
    2) Compute the regression coefficients and use that bootstrap model to compute R-square (or Adj R-square) on the bootstrap sample and on the original sample (<- my unsolved problem)
    3) Compute the difference between the bootsample R-square and original sample R-square
    4) Repeat at least 100 times
    5) Get an average of the difference between R-squares
    6) Use that average as a correction for optimism for the original R-square value obtained while developing the model
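
    In symbols, the six steps above amount to the following. The notation is introduced here only for illustration and is not taken from Harrell's text: B is the number of bootstrap replications, R²_app is the R-square of the original model on the original sample, and for replication b, R²_boot and R²_orig are the R-squares of the bootstrap model evaluated on the bootstrap sample and on the original sample.

```latex
\[
\widehat{O} \;=\; \frac{1}{B}\sum_{b=1}^{B}\left(R^{2}_{\mathrm{boot},b} - R^{2}_{\mathrm{orig},b}\right),
\qquad
R^{2}_{\mathrm{corrected}} \;=\; R^{2}_{\mathrm{app}} - \widehat{O}
\]
```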

    My understanding is that I have to use -simulate- together with an rclass program of my own, but I'm stuck because all the code I have seen posted on Statalist and elsewhere seems to do the opposite: it applies the original model to the bootstrap samples (which is not what I want). Besides, the code I found is for logistic and Cox models (for Harrell's C statistic, not R-square).

    Any pointers?

    Thanks in advance

    Sorry for not posting code, I'm not even there yet...

    Marta

  • #2
    This is the closest I got, with a lot of red crosses when running it:

    Code:
    program define optimism, rclass
        preserve
        bsample
        regress DIF_PESO rs2605100_LYPLAL1 Edad rs4929949_STK33 rs1801133_MTHFR#c.Edad peso1 rs3813929_HTR2C rs659366_UCP2 rs1801133_MTHFR rs11030104_BDNF
        return scalar rsquare=e(er2)
        return scalar psquare=e(r2_adj)
    end
    tempfile sim_results
    simulate r2 = r(rsquare) r2adj=r(psquare), reps(200) seed(12345) saving(`sim_results'): optimism
    Marta



    • #3
      I spotted part of the problem, and I blame it on me being really tired

      Part of the code should have been:
      Code:
       
       return scalar rsquare=e(r2)
       return scalar psquare=e(r2_a)
      But now I only have one set of R-square and adjusted R-square values, and I don't know whether they come from the bootstrap sample or from the original sample with the bootstrap model applied to it. Either way, I'm still missing the other pair, which I need in order to compute the difference and get the correction for optimism.

      Marta
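
One way to obtain the missing pair is sketched below; it is an illustration under stated assumptions, not a tested solution for the model above. After -regress- on the bootstrap sample, e(r2) is the R-square of the bootstrap model on the bootstrap sample. Because -restore- brings back the data but leaves the e() results in place, -predict- can then apply the bootstrap coefficients to the original sample. The sketch assumes that R-square on the original sample is computed as 1 - SSE/SST, which is one common choice and can differ from what other software reports. The auto dataset and the variables price, mpg and weight are stand-ins for the real data, and the names r2_boot, r2_orig, optim and r2_app are made up for this example.

```stata
* Sketch only: auto.dta and price/mpg/weight are stand-ins for the real data.
capture program drop optimism
program define optimism, rclass
    preserve
    bsample                               // bootstrap sample replaces the data
    regress price mpg weight              // fit the bootstrap model
    local r2_boot = e(r2)                 // R-square on the bootstrap sample
    restore                               // original data back; e() results are kept
    tempvar yhat res2
    predict double `yhat', xb             // bootstrap coefficients, original data
    generate double `res2' = (price - `yhat')^2
    quietly summarize `res2'
    local sse = r(sum)                    // residual sum of squares
    quietly summarize price if !missing(`yhat')
    local sst = r(Var) * (r(N) - 1)       // total sum of squares
    local r2_orig = 1 - `sse'/`sst'       // assumed definition: 1 - SSE/SST
    return scalar r2_boot = `r2_boot'
    return scalar r2_orig = `r2_orig'
    return scalar optim   = `r2_boot' - `r2_orig'
end

sysuse auto, clear
regress price mpg weight
scalar r2_app = e(r2)                     // apparent R-square of the original model
simulate optim = r(optim), reps(200) seed(12345): optimism
summarize optim
display "Optimism-corrected R-square: " scalar(r2_app) - r(mean)
```

Note that -simulate- replaces the data in memory with the replication results, so the original dataset has to be saved or preserved beforehand if it is needed afterwards. The same pattern should extend to the adjusted R-square by rescaling 1 - SSE/SST with the usual degrees-of-freedom factor.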
