
  • P value for area under the curve using lroc command

    I am using Stata 13. I have run a logistic regression to check the association of obesity with several covariates. To assess the fit of the model I ran the Hosmer-Lemeshow goodness-of-fit test and also a ROC curve. For the latter, I ran the lroc command. This gives only the area under the curve, with no p-value for it. From another post I found that "roctab" gives the 95% CI of the area under the curve statistic. I would like to know how I can get the significance level for the ROC.

    ----------------
    . lroc

    Logistic model for BMIcat5

    number of observations = 359
    area under ROC curve = 0.7817





    Last edited by Josna Rani; 23 Mar 2019, 03:28.

  • #2
    For a recent test to evaluate the calibration of binary models I refer you to the following paper from The Stata Journal, and this presentation from the 2018 Stata Conference.
    You can install the calibration belt package from the Stata command window:
    Code:
    net install gr0071
    There is an example data set and do-file that you can get by typing:
    Code:
    net get gr0071
    Remember to change your working directory to an appropriate folder before downloading.
    http://publicationslist.org/eric.melse



    • #3
      Josna can use something like the following to get at what I'm guessing Josna's after. Start at the "Begin here" comment.
      Code:
      version 15.1
      
      clear *
      
      set seed `=strreverse("1489649")'
      quietly set obs 250
      
      generate byte outcome = runiform() < 0.5
      forvalues i = 1/3 {
          generate double predictor`i' = runiform()
      }
      
      logit outcome c.(predictor?), nolog
      lroc , nograph
      
      *
      * Begin here
      *
      predict double xb, xb
      
      tempname null z
      scalar define `null' = 0.5
      
      roctab outcome xb
      scalar define `z' = (r(area) - `null') / r(se)
      
      display in smcl as text "z = " %06.4f `z', "P>|z| = " %04.2f 2 * normal(-abs(`z'))
      
      exit
      In lieu of the code above that follows lroc , nograph, I believe that Josna would do well just to omit the option and examine the graph that lroc produces. In several recent posts, Clyde Schechter has included a hyperlink to a recent editorial in a science magazine. It strikes me as pertinent in the context of "to know how I can get the significance level for the ROC", and I recommend that Josna search it out.
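
      For reference, the statistic that the code after the "Begin here" comment computes can be written out as follows. This is only a restatement of that code, assuming the null value of 0.5 (chance-level discrimination) that it uses; the standard error is whatever roctab leaves behind in r(se), and the p-value is the two-sided normal tail area that the display line evaluates with normal().

      ```latex
      z = \frac{\widehat{\mathrm{AUC}} - 0.5}{\widehat{\mathrm{SE}}\bigl(\widehat{\mathrm{AUC}}\bigr)},
      \qquad
      p = 2\,\Phi\bigl(-\lvert z \rvert\bigr)
      ```

      Here \Phi is the standard normal cumulative distribution function, which is what Stata's normal() function returns.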

      I believe that area under the receiver operating characteristic curve is more related to discrimination than calibration.



      • #4
        Hi Joseph Coveney,
        Can you explain, or point to an explanation of, "double" and c.(predictor?) ? What does c.() mean exactly, and when should it be used?

        Thank you,
        Geraldine



        • #5
          "double" is a data-type. If not explicitly specified, the default data type in Stata is single-precision floating point. Specifying "double" in the code above makes the predictions double-precision floating point, the same as all other statistical software (and even Microsoft's Excel). My use of it there is to follow the principle of maintaining full precision in intermediate computations. You can find StataCorp's online helpfile for Stata's data types here.
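
          A minimal sketch of why the storage type can matter, assuming Stata's default storage type has not been changed with set type. The variable names x_float and x_double are made up for the illustration and do not come from the code above.

          ```stata
          clear
          set obs 1

          * Without an explicit type, generate stores the value as a float
          generate x_float = 0.1
          generate double x_double = 0.1

          * The literal 0.1 is held in double precision, so the rounded float no longer matches it
          count if x_float == 0.1

          * Rounding the literal to float precision first restores the match
          count if x_float == float(0.1)

          * The double variable matches the literal directly
          count if x_double == 0.1
          ```

          The first count should report 0 and the other two should report 1. Storing predictions as double avoids carrying this kind of rounding into later computations.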

          Use of c.() is to specify that the predictors (independent variables) in the model are continuous and not categorical (not for example to be considered "dummy" variables). You can find StataCorp's online helpfile for how to specify predictors here.
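
          As a rough illustration of where the prefix makes a difference, here is a sketch using the auto dataset that ships with Stata. The dataset and the choice of regress are assumptions made for the example only; the same factor-variable notation applies to logit.

          ```stata
          sysuse auto, clear

          * Main effects: a variable without a prefix is treated as continuous,
          * so these two commands fit the same model
          regress price mpg i.foreign
          regress price c.mpg i.foreign

          * Interactions: variables joined by # or ## are assumed categorical
          * unless marked with c., so the prefix is needed here
          regress price c.mpg##c.weight
          regress price i.foreign##c.mpg
          ```

          Writing c.(predictor?) in the earlier code applies the c. prefix to every variable matched by predictor?, which makes explicit that all three predictors enter the model as continuous.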

          The sequence of
          Code:
          logit disease . . .
          predict double xb, xb
          roctab disease xb
          I think addresses your earlier question about obtaining confidence intervals for the AUC of the ROC with multiple classifying variables.
