Stata dropping variables that predict success perfectly

Katherine Picho

Join Date: Apr 2014

Posts: 32
#1

12 Jun 2014, 06:24

Dear Statalist

I am currently using Stata 12.

I have a set of dichotomous variables that I'm using to predict a categorical outcome in logistic regression. However, for one of the independent variables (BAND, where 1 = yes, 0 = no), the outcome does not vary within one of its categories. Accordingly, Stata provides the following message:

TR_BAND != 0 predicts success perfectly
TR_BAND dropped and 8 obs not used

I understand why this is happening, i.e., the model can't be fitted because the maximum likelihood estimate of the coefficient for BAND is infinite (the dependent variable doesn't vary within the BAND = 1 category, where every observation is a success).

So effectively, Stata's solution is to drop that variable and all observations where BAND = 1.
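The behavior described above can be reproduced on a small made-up dataset. This is a minimal sketch, not the poster's data: the variables y, x, and band are hypothetical, and band is constructed so that every observation with band == 1 is a success.

```stata
clear
input y x band
1 0 1
1 1 1
1 0 1
0 0 0
1 0 0
0 0 0
0 0 0
1 1 0
0 1 0
1 1 0
1 1 0
end

* Cross-tabulate the outcome against the predictor: an empty cell
* (here, no y == 0 when band == 1) signals perfect prediction
tabulate y band

* -logit- notes that band != 0 predicts success perfectly, drops band,
* and leaves the band == 1 observations out of the estimation sample
logit y x band
```

Checking the cross-tabulation before fitting shows which cell is empty and therefore which observations -logit- will set aside.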

My question is:

1. Should this model be fitted at all (i.e., should I proceed to analyze the results of the model that Stata modified by dropping the foregoing variable)?

2. Would it make sense, or be defensible, to instead remove the problematic variable (BAND) from the model a priori, before running the logistic regression? (In this case, we would be preserving the observations and hence the sample size.)

3. A reviewer has pointed out that since the variable is excluded from the model, the logistic regression analysis should not even be performed at all. Is he correct? Should the model not be fitted at all? OR if he is incorrect, how do I assuage his concerns about interpreting a model where the variable BAND has been excluded?

Thank you!

Katherine Picho

Last edited by Katherine Picho; 12 Jun 2014, 06:25. Reason: Adding my name to the post
Nick Cox

Join Date: Mar 2014

Posts: 35698
#2

12 Jun 2014, 06:42

What you report under #3 from a reviewer is hard to follow and harder to swallow. The fact that a single predictor can't be included in a model has nothing to do with whether the other predictors define a sensible model. If someone by accident included gender as a binary predictor for some obstetric response variable, that would be thrown out because the gender of all the mothers is constant, namely female. But that would be a "Yes, of course!" mistake over including that predictor and nothing more than that. I am not sure how you know that the reviewer is male.
1 like
Richard Williams

Join Date: Apr 2014

Posts: 4994
#3

12 Jun 2014, 07:12

If your sample isn't very big, sometimes -exlogistic- can be used in these cases. The help says "exlogistic is an alternative to logistic, the standard maximum-likelihood-based logistic regression estimator; see [R] logistic. exlogistic produces more-accurate inference in small samples because it does not depend on asymptotic results and exlogistic can better deal with one-way causation, such as the case where all females are observed to have a positive outcome."

Whatever you do, I certainly would not abandon the entire analysis!
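As an illustration of the -exlogistic- suggestion, here is a minimal sketch on a small made-up dataset (the variables y, x, and band are hypothetical, with band == 1 always a success). With data this small the exact computation should be quick; on larger data it may not be.

```stata
clear
input y x band
1 0 1
1 1 1
1 0 1
0 0 0
1 0 0
0 0 0
0 0 0
1 1 0
0 1 0
1 1 0
1 1 0
end

* Exact logistic regression keeps band and all observations;
* -coef- reports coefficients instead of the default odds ratios
exlogistic y x band, coef
```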

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
1 like
Richard Williams

Join Date: Apr 2014

Posts: 4994
#4

12 Jun 2014, 07:25

firthlogit (available from SSC) might also be used in such cases. Allison discusses it briefly at http://www.statisticalhorizons.com/l...or-rare-events.

Some other references:

http://www.ats.ucla.edu/stat/mult_pk...git_models.htm
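For readers who want to try -firthlogit-, a minimal sketch follows. It assumes an internet connection for the SSC download and uses a small made-up dataset (hypothetical variables y, x, and band, with band == 1 always a success).

```stata
* Install the community-contributed command from SSC
ssc install firthlogit, replace

clear
input y x band
1 0 1
1 1 1
1 0 1
0 0 0
1 0 0
0 0 0
0 0 0
1 1 0
0 1 0
1 1 0
1 1 0
end

* Penalized-likelihood logistic regression: band stays in the model
* and no observations are set aside
firthlogit y x band
```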

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
1 like
Katherine Picho

Join Date: Apr 2014

Posts: 32
#5

13 Jun 2014, 08:48

Thank you Drs. Cox & Williams!

I tried exlogistic but ran into computational & memory issues, so I tried firthlogit and it worked like a charm. Thank you also for the references, as I've been able to read up more on this issue, and I've definitely learned something new!
Erick Turner

Join Date: Feb 2018

Posts: 13
#6

17 Feb 2018, 17:33

I am running Stata 11.2 (but also have version 9.2) on Mac OS 10.13. In my dataset, there are 134 cases with 1 binary dependent variable and 2 binary independent variables. Using "plain" logistic regression (the logit command), I, too, got the error message "varname != 0 predicts success perfectly", after which Stata dropped half my observations. Having seen these posts from 2014, I tried exlogistic, and it ran without error messages or dropped observations. However, I'm intrigued by firthlogit. I have not tried it yet due to issues covered in another post, but I'm wondering: if it, too, runs without generating an error message, should one of these two approaches be regarded as more trustworthy than the other? Thanks.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#7

17 Feb 2018, 18:05

-firthlogit- also tolerates perfect prediction because, like -exlogistic-, it does not compute maximum likelihood estimates. (Maximum likelihood estimates are infinitely large when the data exhibit perfect prediction.) -firthlogit- uses penalized maximum likelihood estimation, which allows the estimate to be finite even when perfect prediction obtains.
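For reference, the penalized log likelihood that Firth's method maximizes can be written as below. The notation is introduced here and is not from the thread: $\ell(\beta)$ is the ordinary log likelihood and $I(\beta)$ is the Fisher information matrix.

$$\ell^{*}(\beta) = \ell(\beta) + \frac{1}{2}\log\left|I(\beta)\right|$$

Under perfect prediction, $\ell(\beta)$ keeps increasing as the offending coefficient grows without bound, but $\left|I(\beta)\right|$ shrinks toward zero at the same time, so the penalty term pulls $\ell^{*}(\beta)$ back down and its maximum occurs at a finite value.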

Analogy: -exlogistic- is to -logistic- as Fisher exact test is to Pearson chi square test in a 2x2 contingency table. This is true both in the sense of what they calculate, and in that -exlogistic- is very computation and memory intensive compared to -logistic-. If your data set is large, or if the number of predictors for which you require coefficient estimates is large, the computational burden may be unsupportable. -firthlogit- does not have these limitations. Bear in mind that although -logistic-, -exlogistic- and -firthlogit- are all applied to the same underlying logistic regression model, they use different estimators. So even when all three run without a hitch, they will not produce identical results. With small samples -logistic- estimates can be biased, a problem which -exlogistic- and -firthlogit- do not face.

As for your other post about availability of -firthlogit- for early versions of Stata, Joseph Coveney, the author of -firthlogit-, is an active member of this Forum. He is located in Japan, I believe, and tends to be active when it is daytime there. I haven't noticed if he comes here on weekends or not. In any case, I imagine he will respond to your question when he next comes on the Forum, unless somebody else provides an answer first.
1 like
Erick Turner

Join Date: Feb 2018

Posts: 13
#8

17 Feb 2018, 19:02

Thanks much. Given the scenario in which -exlogistic- and -firthlogit- both run without a hitch, can you think of any a priori rationale for favoring one over the other?
Erick Turner

Join Date: Feb 2018

Posts: 13
#9

19 Feb 2018, 17:18

Appreciate Richard Williams's online post https://www3.nd.edu/~rwilliam/stats3/RareEvents.pdf:

"Note that Leitgöb’s results are consistent with Allison’s belief that the firthlogit method is best."

So my takeaway is that, all other things being equal, it's preferred over -exlogistic-.

Another consideration is replicability: Even if -exlogistic- works fine in one's (relatively small) dataset, it might well blow up when someone tries to do a similar study with a larger N. That would not be an issue with -firthlogit-.
Mohamed Elsayed

Join Date: Feb 2017

Posts: 29
#10

12 Mar 2018, 17:13

Hi Clyde Schechter

I read your last response and ran -firthlogit- on a sample that suffers from perfect prediction. It works; however, I noticed that the output doesn't provide a pseudo R-squared. Would you please tell me how I can get the pseudo R-squared?

Also, is there any way to employ -outreg2- to get the results in out-source like Excel or the Stata data editor?

Thanks in advance.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#11

12 Mar 2018, 20:11

I'm afraid I don't know the answer to either question. (And I don't actually understand the second one.)

I'm not a devotee of pseudo-R-squared, and I don't remember how it is calculated. But to the best I recall, it is calculated from the log likelihood statistic. Firthlogit, using penalized maximum likelihood, might not even produce a suitable likelihood statistic for this purpose. I don't know. The author of -firthlogit-, Joseph Coveney, is an active member of this forum and perhaps he will comment on this.

As for -outreg2-, I don't use it and don't know enough about it to advise others how to. That said, I don't understand what you mean by "get the results in out-source." Also the conjunction of Excel and Stata data editor is odd, as one is a software application and the other is a window in Stata. So I really don't even know where you're going with this, let alone whether or how -outreg2- will get you there.
Mohamed Elsayed

Join Date: Feb 2017

Posts: 29
#12

13 Mar 2018, 05:40

Thank you, Clyde. I hope to get an answer regarding pseudo R-squared from Joseph. In this context, if getting a pseudo R-squared is not possible with -firthlogit-, can I run the model using -logit- to get the pseudo R-squared, and rerun it using -firthlogit- to get the coefficients? That way I would not lose observations, which happens with -logit-, and I would obtain the pseudo R-squared, which does not appear in the -firthlogit- output.

My second question: do -lrtest- for the likelihood-ratio test and -test- for the Wald χ² test work with -firthlogit- in the same way as with -logit-?

Sorry for typos as I am typing from mobile.

Thank you.
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#13

13 Mar 2018, 07:04

I've never paid attention to any of the various pseudo-R²s, and I don't know whether any of them can be computed from what -firthlogit- delivers.

I don't use or know anything about -outreg2- and so can't tell you whether it can be used with -firthlogit-, sorry.

As for likelihood-ratio and Wald tests with -firthlogit-, read the online help that comes with it. Short answer for the former: yes, likelihood-ratio testing can be done with -firthlogit-, but, no, it's not done the same way that it's done with -logit-. Short answer for the latter: don't do it if you're using -firthlogit- for the same reason that most people find themselves having to resort to it.
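A sketch of what the likelihood-ratio approach might look like: fit the full model, then refit with the tested coefficient constrained to zero (rather than dropping the variable), and compare the two fits. This is an assumption about the workflow, not a quotation of the help file. In particular, the `constraints()` option as written here, the made-up variables y, x, and band, and the stored names full and constrained are all illustrative and should be checked against `help firthlogit`.

```stata
ssc install firthlogit, replace

clear
input y x band
1 0 1
1 1 1
1 0 1
0 0 0
1 0 0
0 0 0
0 0 0
1 1 0
0 1 0
1 1 0
1 1 0
end

* Full model, with all coefficients free
firthlogit y x band
estimates store full

* Same covariates, but with the coefficient on band held at zero
* (assumes -firthlogit- accepts the constraints() option)
constraint 1 band = 0
firthlogit y x band, constraints(1)
estimates store constrained

* Likelihood-ratio test of band based on the two penalized fits
lrtest full constrained
```

Keeping band in both fits means both penalized log likelihoods are computed from the same set of covariates, which is why the constrained refit is used instead of simply omitting the variable.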
Mohamed Elsayed

Join Date: Feb 2017

Posts: 29
#14

13 Mar 2018, 08:11

Hi Joseph Coveney

Thank you very much for your explanations. I am a little confused by your last sentence. I think you mean I should not use -firthlogit- if the reason is Stata dropping observations because of perfect prediction. Briefly, I want to use -firthlogit- because when I included year fixed effects in my -logit- model, Stata found that 2 years predict the outcome perfectly and dropped their observations. I do not want to lose observations, as some referees would see this as a problem. Therefore, I read widely about the perfect prediction problem in -logit-, and some sources suggest -firthlogit- as a powerful solution (e.g., https://stats.idre.ucla.edu/stata/da...ic-regression/ ; https://stats.idre.ucla.edu/other/mu...eal-with-them/ and this post).

My main questions, in my case:
1- Is it a good solution to use -firthlogit- to overcome the problem I face when I include year fixed effects in my -logit- model and Stata drops the observations for 2 years that predict the outcome perfectly?

2- Because I should report a pseudo R², can I run -firthlogit- to estimate the coefficients and rerun the model using -logit- just to obtain the pseudo R², as both do equivalent work?

3- If 1 and 2 are incorrect, is it a problem that Stata drops 2 years (i.year) that predict the outcome perfectly, along with their observations, when I include year fixed effects in the -logit- model? Or can it be justified, with model fit unaffected?

I also appreciate the insightful opinion, clarifications, and suggestions of Clyde Schechter in responding on these questions.

Thanks in advance.
Andrew Musau

Join Date: Oct 2014

Posts: 10195
#15

13 Mar 2018, 08:40

2- Because I should report a pseudo R², can I run -firthlogit- to estimate the coefficients and rerun the model using -logit- just to obtain the pseudo R², as both do equivalent work?

Doing what you suggest will be incorrect. Because McFadden's Pseudo $R^{2}$ simply gives you the improvement in the log-likelihood resulting from the addition of variables to the model, it can be computed for any command that outputs a log-likelihood. You can do this from the formula

$$\text{McFadden's Pseudo}\; R^{2}= 1 - \frac{L_{1}}{L_{0}}$$

where $L_{1}$ is the maximized log-likelihood for the model and $L_{0}$ is the maximized log-likelihood using the same model but with only an intercept. Firthlogit does not allow one to estimate a model with no regressors, but you can create a variable whose values are constant, which will be omitted because of collinearity, resulting in a model with only an intercept.

Code:

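webuse hiv1
// constant regressor; it is omitted for collinearity, leaving an intercept-only model
gen k = 1
// log likelihood reported by -firthlogit- for the full model
qui firthlogit hiv cd4 cd8
scalar l1 = e(ll)
// log likelihood reported by -firthlogit- for the intercept-only model
qui firthlogit hiv k
scalar l0 = e(ll)
scalar McFadden_R2 = 1 - (l1/l0)
di McFadden_R2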