LR chi2 and Pseudo-R^2 - Enough to assess model fit?

Kim Veloso

Join Date: Jun 2018

Posts: 19
#1


21 Jun 2018, 19:20

Dear all,

Is it sufficient to assess a logit model's fit based on the LR chi2, Prob > chi2, and pseudo-R^2 (McFadden's R^2)? Or must I run other tests?

Data used: Labor Force Survey

Code:

logit Y i.sex i.education i.sec3 i.urbrur i.marital i.age_grp

Edit: I'm still not sure whether weights should be included in the logit regression, so I have posted the weighted version of the Stata command below as well.

Code:

logit Y i.sex i.education i.sec3 i.urbrur i.marital i.age_grp [pw=round(weight)]

Any pointers are highly appreciated!

Thank you!

Last edited by Kim Veloso; 21 Jun 2018, 20:17.
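
[Editorial sketch, not part of the original posts.] The three header statistics asked about in #1 all come from two log likelihoods: the fitted model's and the constant-only model's. The minimal sketch below assumes Stata's shipped auto.dta and an arbitrary model (foreign on mpg and weight) purely for illustration; it is not the poster's Labor Force Survey data. It recomputes the LR chi2, its p-value, and McFadden's pseudo-R^2 from the results that -logit- stores in e().

```stata
* Illustrative only: auto.dta ships with Stata and is not the poster's data.
sysuse auto, clear
logit foreign mpg weight

* The header statistics are functions of two log likelihoods:
* e(ll) for the fitted model and e(ll_0) for the constant-only model.
display "LR chi2(" e(df_m) ") = " %9.2f 2*(e(ll) - e(ll_0))
display "Prob > chi2    = " %6.4f chi2tail(e(df_m), e(chi2))
display "Pseudo R2      = " %6.4f 1 - e(ll)/e(ll_0)
display "Stored e(r2_p) = " %6.4f e(r2_p)
```

Because all three numbers compare the model only with an intercept-only model, they say little about calibration or functional form, which is what the goodness-of-fit and specification tests discussed later in the thread address.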
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

22 Jun 2018, 00:22

Kim:
see also -help estat gof-.

Kind regards,
Carlo
(Stata 19.0)
Kim Veloso

Join Date: Jun 2018

Posts: 19
#3

22 Jun 2018, 04:42

Originally posted by Carlo Lazzaro:

Kim:
see also -help estat gof-.

Thank you so much, Sir Carlo! I appreciate your help!

I have two follow up questions, if I may.

First, after running my unweighted logit regression,

Code:

. logit new_occgrp i.sex i.education i.sec3 i.urbrur i.marital i.age_grp, robust

I ran your suggested command and got the following output from Stata:

Code:

. estat gof, group(10)

       number of observations =     35600
             number of groups =        10
      Hosmer-Lemeshow chi2(8) =    124.03
                  Prob > chi2 =    0.0000

Does a significant Hosmer-Lemeshow test suggest a "bad" fit?

Secondly, I also tried to run

Code:

linktest

If my unweighted logit regression fails the linktest (i.e., _hatsq is significant) and/or has a low McFadden R^2 (i.e., below 0.2), but the weighted logit regression passes both (_hatsq is not significant and McFadden R^2 is above 0.2), does that mean I should use the weighted logit regression instead?

Thank you very much once again!
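
[Editorial sketch, not part of the original posts.] For readers unfamiliar with -linktest-: it refits the model using only the linear prediction (_hat) and its square (_hatsq) as regressors. If the original specification is adequate, _hatsq should add nothing, so a significant _hatsq hints at a misspecified link or omitted terms. The example below again assumes Stata's shipped auto.dta and an arbitrary model; it only shows where to look in the output.

```stata
* Illustrative only: auto.dta ships with Stata and is not the poster's data.
sysuse auto, clear
logit foreign mpg weight

* Refit on _hat and _hatsq; read the z statistic and p-value on the
* _hatsq row of the resulting table.
linktest
```

A non-significant _hatsq is evidence against one kind of misspecification, not proof that the model is correct.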
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#4

22 Jun 2018, 05:20

re: Hosmer-Lemeshow - yes, a statistically significant result shows a problem; note, however, that this test can be "too powerful"; see #13 in https://www.statalist.org/forums/for...on-survey-data
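
[Editorial sketch, not part of the original posts.] With group(10), the Hosmer-Lemeshow statistic is referred to a chi-squared distribution with 10 - 2 = 8 degrees of freedom, which matches the chi2(8) in #3. With tens of thousands of observations, even small departures from perfect calibration can produce a significant result, so it may help to look at the grouped table and at other summaries as well. The sketch below assumes Stata's shipped auto.dta and an arbitrary model, purely to show the commands.

```stata
* Illustrative only: auto.dta ships with Stata and is not the poster's data.
sysuse auto, clear
logit foreign mpg weight

* Hosmer-Lemeshow test with 10 groups, plus the observed vs. expected table
estat gof, group(10) table

* The result can be sensitive to the number of groups, so vary it
estat gof, group(8)

* Complementary summaries: classification table and area under the ROC curve
estat classification
lroc, nograph
```

None of these replaces judgment about whether the observed-versus-expected differences in the table are large enough to matter in practice.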
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#5

22 Jun 2018, 13:31

the OP sent me a private message with a follow-up question:

Dear Mr. Goldstein,

Thank you very much for answering my question. I have also been reading about Hosmer-Lemeshow and realize that I need to assess it with caution.

A follow up question, if I may:

Must I use svyset if I want to run a logit regression using labor force survey data?

Thank you very much once again.

Best,
Kim

first, please don't do that

second, I am not an expert on the data you are talking about - in general, the answer is "it depends" - in at least some situations, you can use "pweights" with logistic regression and they do not require the use of -svyset-
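
[Editorial sketch, not part of the original posts.] To make the pweights versus -svyset- distinction concrete, the sketch below fits the same weighted logit both ways. Everything in it is an assumption for illustration: the data are simulated, and the variable names (y, x, psu, weight) are hypothetical stand-ins, not the Labor Force Survey variables. Both commands maximize the same weighted pseudolikelihood, so the point estimates should agree; the standard errors and test statistics can differ, particularly once the declared design includes strata or further stages.

```stata
* Simulated data; all names and values here are hypothetical.
clear
set seed 12345
set obs 2000
generate psu    = ceil(_n/50)          // 40 clusters of 50 observations
generate weight = 1 + 4*runiform()     // made-up sampling weights
generate x      = rnormal()
generate y      = runiform() < invlogit(-0.5 + 0.8*x)

* Route 1: pweights with cluster-robust standard errors, no svyset needed
logit y x [pw=weight], vce(cluster psu)
scalar b_pw = _b[x]

* Route 2: declare the design once, then use the svy: prefix
svyset psu [pw=weight]
svy: logit y x
scalar b_svy = _b[x]

display "coefficient on x, pweight route: " %9.6f b_pw
display "coefficient on x, svy route:     " %9.6f b_svy
```

Which route is appropriate for a given survey depends on its documented design, so the survey's own technical notes are the place to check before choosing.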
Richard Williams

Join Date: Apr 2014

Posts: 5008
#6

22 Jun 2018, 16:02

As luck would have it, this old thread on pweights vs svy came back to life the other day:

https://www.statalist.org/forums/for...-are-available

Personally, one way or another, I usually use the weights. Having said that, there are arguments for NOT weighting. But I think you have to understand things really well to not weight. See the Appendix of

https://www3.nd.edu/~rwilliam/stats3/SvyCautionsX.pdf

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Kim Veloso

Join Date: Jun 2018

Posts: 19
#7

23 Jun 2018, 21:35

Thank you very much, Mr. Goldstein, and Mr. Williams! I highly appreciate your input.

After some more reading and consultations, I have decided to run the following regression using the Labor Force Survey data.
It follows the approach suggested by the UNC Carolina Population Center: http://www.cpc.unc.edu/research/tool...rveys/logistic

Code:

logit Y i.sex i.education i.sector i.urban i.marital i.age_grp if working==1 [pw=weight], cluster(psu)

McFadden's pseudo-R^2 is above 0.2 and _hatsq is not significant, suggesting good model fit and specification.

Thank you all very much once again!