P value for area under the curve using lroc command

Josna Rani

Join Date: Feb 2019

Posts: 20
#1

P value for area under the curve using lroc command

23 Mar 2019, 03:16

I am using Stata 13. I have run a logistic regression to check the association of obesity with several covariates. To assess the fit of the model, I ran the Hosmer-Lemeshow goodness-of-fit test and also produced a ROC curve. For the latter, I ran the lroc command. This gives only the area under the curve, with no p-value for it. From another post I found that "roctab" gives the 95% CI of the area under the curve statistic. I would like to know how I can get the significance level for the ROC.

----------------
. lroc

Logistic model for BMIcat5

number of observations = 359
area under ROC curve = 0.7817

Last edited by Josna Rani; 23 Mar 2019, 03:28.
ericmelse

Join Date: May 2014

Posts: 434
#2

23 Mar 2019, 09:07

For a recent test to evaluate the calibration of binary-outcome models, I refer you to the following paper from The Stata Journal, and this presentation from the 2018 Stata Conference.
You can install the calibration belt package from the Stata command window:

Code:

net install gr0071

There is also an example dataset and do-file, which you can get by typing:

Code:

net get gr0071

Remember to first change your working directory to an appropriate folder before downloading.

http://publicationslist.org/eric.melse
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#3

23 Mar 2019, 20:31

Josna can use something like the following to get at what I'm guessing Josna's after. Start at the "Begin here" comment.

Code:

version 15.1

clear *
set seed `=strreverse("1489649")'

quietly set obs 250
generate byte outcome = runiform() < 0.5
forvalues i = 1/3 {
    generate double predictor`i' = runiform()
}

logit outcome c.(predictor?), nolog
lroc , nograph

*
* Begin here
*
predict double xb, xb

tempname null z
scalar define `null' = 0.5

roctab outcome xb
scalar define `z' = (r(area) - `null') / r(se)
display in smcl as text "z = " %06.4f `z', "P>|z| = " %04.2f 2 * normal(-abs(`z'))

exit
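
For anyone following the arithmetic: the lines after "Begin here" compute a Wald-type z statistic against a null area of 0.5, using the area and standard error that roctab leaves in r(area) and r(se). Written out, with \widehat{\mathrm{AUC}} and \widehat{\mathrm{SE}} as shorthand introduced here for those two results (not notation used elsewhere in this thread):

```latex
z = \frac{\widehat{\mathrm{AUC}} - 0.5}{\widehat{\mathrm{SE}}},
\qquad
p = 2\,\Phi\!\left(-\lvert z \rvert\right)
```

Here \Phi is the standard normal cumulative distribution function, which is what Stata's normal() returns.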

In lieu of the code above that follows lroc , nograph, I believe that Josna would do well just to omit the option and examine the graph that lroc produces. In several recent posts, Clyde Schechter has included a hyperlink to a recent editorial in a science magazine. It strikes me as pertinent in the context of "to know how I can get the significance level for the ROC", and I recommend that Josna search it out.

I believe that area under the receiver operating characteristic curve is more related to discrimination than calibration.
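
To make the distinction concrete, here is a minimal sketch that runs one check of each kind after the same logit fit. It uses Stata's bundled auto dataset and the variables foreign, mpg, and weight purely for illustration; they are assumptions and have nothing to do with the original poster's data.

```stata
* Illustrative data only (an assumption): Stata's bundled auto dataset
sysuse auto, clear
logit foreign c.mpg c.weight, nolog

* Calibration: Hosmer-Lemeshow goodness-of-fit test using 10 groups
estat gof, group(10)

* Discrimination: area under the ROC curve, without drawing the graph
lroc, nograph
```

The first command asks whether predicted probabilities agree with observed frequencies; the second asks how well the model separates the two outcome groups.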
geraldine tran

Join Date: May 2019

Posts: 3
#4

09 May 2019, 15:20

Hi Joseph Coveney,
Can you explain, or point to an explanation of, "double" and the c.(predictor?) notation? What does c.() mean exactly, and when should it be used?

Thank you,
Geraldine
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#5

09 May 2019, 16:58

"double" is a data-type. If not explicitly specified, the default data type in Stata is single-precision floating point. Specifying "double" in the code above makes the predictions double-precision floating point, the same as all other statistical software (and even Microsoft's Excel). My use of it there is to follow the principle of maintaining full precision in intermediate computations. You can find StataCorp's online helpfile for Stata's data types here.

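A small sketch of why the storage type can matter, assuming the default type has not been changed with set type (so a bare generate creates a float). The variable names x and y are made up for the illustration.

```stata
clear
set obs 1

generate x = 0.1           // stored as float (single precision) by default
generate double y = 0.1    // stored as double precision

* Literals are evaluated in double precision, so the float copy
* of 0.1 does not compare equal to the literal 0.1
count if x == 0.1
count if y == 0.1

* The two stored values differ after about the seventh significant digit
display %21.18f float(0.1)
display %21.18f 0.1
```

The first count should report 0 and the second 1, because 0.1 has no exact binary representation and the float version is rounded more coarsely than the double.
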
Use of c.() is to specify that the predictors (independent variables) in the model are continuous and not categorical (not for example to be considered "dummy" variables). You can find StataCorp's online helpfile for how to specify predictors here.
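
A brief sketch of the factor-variable prefixes, again using the bundled auto dataset as a stand-in (an assumption; the variable names are from that dataset, not from this thread).

```stata
sysuse auto, clear

* These two specifications are equivalent: c.() distributes over the list
logit foreign c.mpg c.weight, nolog
logit foreign c.(mpg weight), nolog

* i. treats a variable as categorical and creates indicator terms
regress price c.weight i.rep78

* Inside an interaction, variables are assumed categorical unless marked
* with c., so the prefix is needed for a squared continuous term
regress price c.weight##c.weight
```

In the earlier example, predictor? is an ordinary Stata wildcard that matches predictor1, predictor2, and predictor3, so c.(predictor?) marks all three as continuous. In a model with main effects only the c. prefix is optional, since unprefixed variables are already treated as continuous; it becomes necessary in interactions.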

The sequence of

Code:

logit disease . . .
predict double xb, xb
roctab disease xb

I think addresses your earlier question about obtaining confidence intervals for the AUC of the ROC with multiple classifying variables.
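
As a runnable sketch of that sequence, here it is on the bundled auto dataset, with foreign standing in for disease and mpg and weight as the classifying variables (all assumptions for illustration). It also assumes that roctab leaves the confidence limits in r(lb) and r(ub) alongside r(area) and r(se); return list at the end shows what is actually stored.

```stata
sysuse auto, clear
logit foreign c.mpg c.weight, nolog

* Linear predictor combining the classifying variables into one score
predict double xb, xb

* Area under the ROC curve with its standard error and confidence interval
roctab foreign xb

* Pick up the stored results for further use
display "AUC = " %6.4f r(area) ", 95% CI " %6.4f r(lb) " to " %6.4f r(ub)
return list
```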