calculating hit rate after logistic regression

eric chen

Join Date: Jul 2015

Posts: 11
#1

08 Sep 2015, 08:03

hi,

I'd like to know how to calculate the hit rate after a logistic regression. My data come from a survey design, and I have svyset them.

Also, for checking the predictive power of a logistic model, is there any difference between k-fold cross-validation and the hit rate?

Thanks
Andrew Lover

Join Date: Apr 2014

Posts: 182
#2

08 Sep 2015, 13:11

"hit rate" is perhaps analogous to ROC analysis? see

http://www.stata.com/statalist/archi.../msg00493.html

__________________________________________________ __
Assistant Professor, Department of Biostatistics and Epidemiology
School of Public Health and Health Sciences
University of Massachusetts- Amherst
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#3

08 Sep 2015, 13:19

it would be helpful, to me at least, if you would define what you mean by "hit rate" - and a citation to a technical article would also be appreciated
eric chen

Join Date: Jul 2015

Posts: 11
#4

09 Sep 2015, 13:28

Originally posted by Rich Goldstein

it would be helpful, to me at least, if you would define what you mean by "hit rate" - and a citation to a technical article would also be appreciated

Sorry for the ambiguity.

Here's the definition:

"The (insample) hit rate is defined as the percentage of the observations (in-sample) that is correctly predicted by the model."

Sorry I couldn't find a technical paper on this concept. But here's a relevant example:

http://www.jstor.org/stable/1392289

Thanks
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#5

09 Sep 2015, 13:56

"estat classification" is what it appears you want - it is NOT the same thing as k-fold-cross-validation; after estimating your logistic model, just type

Code:

estat classification

it appears that what you want is the line marked "Correctly classified"
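To make the distinction from k-fold cross-validation concrete, here is a minimal sketch, not a definitive recipe. It assumes an ordinary (non-survey) fit on the auto dataset that ships with Stata, with foreign, mpg, and weight standing in for your own variables. The 0.5 cutoff, the seed, and the choice of 5 folds are arbitrary illustrative choices. The first part gives the in-sample figure that estat classification reports as "Correctly classified"; the second part estimates the same quantity out of sample with a hand-rolled 5-fold cross-validation.

Code:

* In-sample hit rate (assumed example data: auto.dta shipped with Stata)
sysuse auto, clear
logit foreign mpg weight
estat classification            // see the line "Correctly classified"
* Out-of-sample hit rate by 5-fold cross-validation (hand-rolled sketch)
set seed 12345                  // arbitrary seed, for reproducibility only
generate double u = runiform()
sort u
generate byte fold = mod(_n, 5) + 1
generate double phat_cv = .
forvalues k = 1/5 {
    * fit on the other four folds, then predict for the held-out fold
    quietly logit foreign mpg weight if fold != `k'
    quietly predict double p_tmp, pr
    quietly replace phat_cv = p_tmp if fold == `k'
    drop p_tmp
}
generate byte hit_cv = ((phat_cv >= 0.5) == foreign) if !missing(phat_cv)
summarize hit_cv                // mean = cross-validated hit rate

The two numbers will generally differ: the in-sample rate is computed on the same observations used to fit the model, so it tends to be optimistic. Run this from a do-file, since the // comments are not accepted at the interactive prompt.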
eric chen

Join Date: Jul 2015

Posts: 11
#6

09 Sep 2015, 14:24

Originally posted by Rich Goldstein

"estat classification" is what it appears you want - it is NOT the same thing as k-fold-cross-validation; after estimating your logistic model, just type

Code:

estat classification

it appears that what you want is the line marked "Correctly classified"

Hi,

Thanks for the information. I have tried the command with my data but it didn't work. I think it's because I'm using a survey design. Hence the regression was

Code:

svy: logistic A B

What I'm trying to do here is to find out the predictive power of the model, i.e. whether it's a good model or not.

Apologies for my lack of technical knowledge.

Thanks
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#7

09 Sep 2015, 14:42

First, note that "whether it's a good model or not" is not a simple question for any model. For logistic regression, many people assess the model by its discrimination (area under the ROC curve) and its calibration. I'm not sure how to get discrimination for a survey design (I don't use surveys much myself), but you can type "estat gof" after your command to get at least some information on fit. Note that the null hypothesis for this test is that the model fits well, so a p-value < 0.05 means you reject that null, i.e., there is evidence of poor fit.

There is probably a fairly simple way to get your "hit rate" following "svy", but I don't immediately see what it is. And, yes, "estat classification" cannot be used after "svy". Please see the FAQ for advice on how to write questions.
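One way the "hit rate" could be computed by hand after svy is sketched below. Treat it as an illustration under stated assumptions, not an official Stata facility. It uses the nhanes2d example dataset from the Stata survey documentation (already svyset, and downloaded by webuse, so it needs an internet connection), with highbp and its covariates standing in for A and B. An observation counts as a predicted 1 when the fitted probability is at least 0.5, which is an arbitrary cutoff. The survey-weighted mean of a 0/1 "correctly predicted" indicator then estimates the hit rate.

Code:

* Assumed example data; substitute your own svyset data and variables
webuse nhanes2d, clear
svy: logistic highbp height weight age female
estat gof                                   // goodness-of-fit test after svy
* Fitted probabilities for the estimation sample only
predict double phat if e(sample), pr
* 1 if the predicted class (cutoff 0.5) matches the observed outcome
generate byte hit = ((phat >= 0.5) == highbp) if e(sample)
* Design-based estimate of the proportion correctly predicted
svy: mean hit

Because svy: mean uses the declared weights and design, the result estimates the proportion of the population, not just of the sample, that the model classifies correctly. It is still an in-sample figure in the sense that the same observations were used for fitting.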
Andrew Lover

Join Date: Apr 2014

Posts: 182
#8

09 Sep 2015, 16:10

You might also look into the Hosmer-Lemeshow goodness-of-fit test and related approaches:

http://www.stata-journal.com/sjpdf.h...iclenum=st0099

http://www.statalist.org/forums/foru...on-survey-data

__________________________________________________ __
Assistant Professor, Department of Biostatistics and Epidemiology
School of Public Health and Health Sciences
University of Massachusetts- Amherst
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#9

09 Sep 2015, 18:03

estat gof after svy: logistic is the survey-data counterpart of the Hosmer-Lemeshow test of fit (the Archer-Lemeshow F-adjusted mean residual test).

Steve Samuels
Statistical Consulting

Stata 14.2
David Radwin

Join Date: Mar 2014

Posts: 368
#10

09 Sep 2015, 18:20

Percent correctly predicted is not necessarily a good indicator of goodness of fit, particularly in samples where successes or failures are rare. For example, if only 0.01% of applicants are accepted to a school, the model "nobody is accepted" (a constant with no parameters) would yield 99.99% correct predictions.
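The arithmetic can be checked with a small made-up dataset. This is a sketch only: the 10,000 applicants, the single acceptance, and the variable names accepted, predicted, and hit are all hypothetical.

Code:

* Hypothetical data: 10,000 applicants, exactly one accepted (0.01%)
clear
set obs 10000
generate byte accepted  = (_n == 1)
* The "nobody is accepted" rule predicts 0 for everyone
generate byte predicted = 0
generate byte hit = (predicted == accepted)
quietly summarize hit
display %5.2f 100*r(mean)       // percent correctly predicted: 99.99

The rule scores 99.99% correct even though it never identifies the one accepted applicant, which is why the hit rate alone says little when one outcome is rare.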

One alternative goodness-of-fit measure:
Herron, M. C. (1999). Postestimation uncertainty in limited dependent variable models. Political Analysis, 8(1), 83-98. http://www.polmeth.wustl.edu/analysis/vol/8/herron.pdf

David Radwin
Senior Researcher, California Competes
californiacompetes.org
Pronouns: He/Him
