  • Logit model (backed up) Warning

    I am currently trying to run a logit to identify the probability of being tribal versus non-tribal on the subsample of adult women only. I have 5 rounds of data, and my independent variables include years of education, age, marital status, rural residence, children under 5, household size, age of household head, and monthly per capita income.
    I have also included a set of zone dummies, zone1-zone6, which divide the country into 6 unique zones, each zone consisting of a group of states.


    This is the command I use:

    Code:
    foreach r in 1 2 3 4 5 {
        * logit of tribal status (scst) for adult women in round `r'
        logit scst age yrs_ed rural marital n_mpce ch_under5 fhead headage heademp hhsize i.zone [aw=hhwt] ///
            if (keep_ageRC==1) & (sex==2) & (round==`r'), iterate(100)
    
        * predicted probability of being scst for the estimation sample
        predict phatscst`r' if e(sample), pr
    
        * weighted sample proportion scst, used in the reweighting factor below
        sum scst [aw=hhwt] if (keep_ageRC==1) & (sex==2) & (round==`r')
        gen pbarscst`r' = r(mean)
    
        * reweighting factor for non-scst observations
        gen scstwt`r' = (phatscst`r'/(1-phatscst`r')) * ((1-pbarscst`r')/pbarscst`r') * (hhwt) if scst==0 & round==`r'
    }
    The issue I run into is that for rounds 1, 2, and 3 the logit converges perfectly; however, for rounds 4 and 5, after about 6 iterations, the log starts showing (backed up), like this:

    Code:
    Iteration 0:   log pseudolikelihood = -1.683e+08  
    Iteration 1:   log pseudolikelihood = -1.557e+08  
    Iteration 2:   log pseudolikelihood = -1.545e+08  
    Iteration 3:   log pseudolikelihood = -1.545e+08  
    Iteration 4:   log pseudolikelihood = -1.545e+08  (backed up)
    Iteration 5:   log pseudolikelihood = -1.545e+08  (backed up)
    Iteration 6:   log pseudolikelihood = -1.545e+08  (backed up)
    Iteration 7:   log pseudolikelihood = -1.545e+08  (backed up)
    Iteration 8:   log pseudolikelihood = -1.545e+08  (backed up)
    Iteration 9:   log pseudolikelihood = -1.545e+08  (backed up)
    Iteration 10:  log pseudolikelihood = -1.545e+08  (backed up)
    .
    .
    .<output omitted>
    Warning: Convergence not achieved
    I tried introducing the variables sequentially, and it seems that monthly per capita income (n_mpce) is the problem variable. I cannot remove it from my model since it is the only measure of socioeconomic status I have. I tried converting it to daily per capita income to reduce the scale, but that doesn't seem to work either.

    My questions are:

    1. What does Stata mean by (backed up), and how is it different from the (not concave) warning?
    2. If I restrict the iterations to 100 and broadly get the results I want (for reweighting), how big an issue is this non-convergence of the model?
    3. Why is the non-convergence happening in rounds 4 and 5 only and not in the earlier rounds? The unit of measurement of MPCE has been the same in all rounds.

  • #2
    Also posted at: http://stackoverflow.com/questions/2...ked-up-warning



    • #3
      See the -maximize- section in [R] for more information about the "backed up" message, specifically pages 1264-5.

      That said, it's a little bit odd to have this kind of problem arise in a simple logistic regression model, and I would wonder if there isn't something peculiar about the data in rounds 4 and 5. I think I would check the distribution of n_mpce in the round 4 and round 5 estimation samples, both marginal and conditional on the scst outcome. Is there perhaps something approaching complete separation? Is the variance of n_mpce unusually high or low? And, if so, are there implausible values that are likely to be data errors?
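
      A minimal sketch of those checks in Stata, reusing the variable names from #1 (the particular statistics shown are only illustrative):

      Code:
      foreach r in 4 5 {
          * marginal distribution of n_mpce in the round-`r' estimation sample
          summarize n_mpce if keep_ageRC==1 & sex==2 & round==`r', detail
          * distribution conditional on the scst outcome, to spot near-separation or extreme values
          tabstat n_mpce if keep_ageRC==1 & sex==2 & round==`r', by(scst) statistics(mean sd min max p99)
      }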

      All of this may or may not solve your problem. These things are difficult to diagnose and fix at a distance. I hope it does help you.



      • #4
        Thanks!

        I checked and it seems that the extreme values of n_mpce are in fact the problem. Less than 0.1% of the (weighted) sample has MPCE more than 10000. If in my logit command I add an additional condition of

        Code:
        logit y x1 x2 if age>15 &.....& n_mpce<10000
        Then the model has no problem.

        I hope censoring the sample in such a way isn't 'wrong', so to speak. Overall I expect it to have little effect on my means (which is what I am interested in).



        • #5
          Censoring isn't "wrong." Well, actually, the word censoring is wrong. You are actually truncating by excluding observations beyond a limit--to censor would be to treat observations greater than that limit as being at that limit.

          Semantics and terminology aside, you need to think about some things in this truncation situation.

          1. n_mpce >= 10000 sounds, to me, like a good candidate for being incorrect data. There aren't many currency units where you will find anyone other than a Wall Street financier with a daily income that large. Even if you're referring to the original monthly scaling, it's still pretty high (though not implausible). So I'd be eager to find out whether those observations are data errors. If they are, ideally one could replace them with corrected values and leave the analysis alone. If corrected values can't be found, you are at least on solid ground excluding them (though the possibility remains that the erroneous values did not occur at random...). Fortunately you have very few such observations, so it shouldn't matter much.

          2. If those values are actually correct, then you are on somewhat shaky ground excluding them. At a minimum, whatever conclusions you draw from your model have applicability restricted to the subpopulation with n_mpce < 10000. Since this only excludes a small fraction of your total population that's not so terrible--but still one has to wonder if a universally applicable model might not be found.

          3. So there may be other ways to deal with this. Have you considered log-transforming n_mpce, or cube-rooting it if there are zeroes (or some other transformation)? You might well get an even better-fitting model that way, and reach unrestricted conclusions too.
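
          A minimal sketch of what point 3 might look like in Stata (the new variable names are only illustrative; n_mpce is as in #1):

          Code:
          gen ln_mpce   = ln(n_mpce) if n_mpce > 0   // the log is undefined at zero and below
          gen cbrt_mpce = n_mpce^(1/3)               // the cube root is fine at zero; negatives would need a signed version
          Either transformed variable could then replace n_mpce in the logit command from #1.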

          Last edited by Clyde Schechter; 20 Mar 2015, 16:46.



          • #6
            This is quite helpful!!

            I decided to go with the log transformation. It seemed to me the missing and zero MPCE values were more likely to be data errors than the higher values. In any case I lost fewer observations by log-transforming MPCE than by truncating the distribution. The model didn't choke, and I got results almost exactly the same as I was getting with MPCE<10000.

            Thank you for the helpful comments.



            • #7
              Hi,
              Happy to find this post. I am experiencing the same problem. One of my variables ranges from -100 to 600. If I limit this variable to the range -100 to 150, the model produces results; otherwise Stata keeps showing "Iteration 26: log pseudolikelihood = -1572.3012 (backed up)". I can't do a log transformation because the variable contains a lot of negative values and zeros. I am wondering whether I should indicate in my paper that I had to exclude some observations in order to run the logit regression, something like that? Many thanks.



              • #8
                If you do end up excluding observations, you definitely have to report that fact in your paper and explain why. Otherwise you are misrepresenting your model. No exceptions, no excuses.

                That said, have you tried using a cube root transformation? That will narrow the range of the data and doesn't require excluding any observations because every real number has a real cube root. Excluding observations from data is usually a last resort and often has serious consequences for validity. You should exhaust all other possibilities first.
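
                For a variable with negative values and zeros, a signed cube root is one way to do it. A minimal sketch, with x standing in for your variable that ranges from -100 to 600:

                Code:
                gen cbrt_x = sign(x) * abs(x)^(1/3)   // defined for negative, zero, and positive values alike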



                • #9
                  Originally posted by Clyde Schechter View Post
                  If you do end up excluding observations, you definitely have to report that fact in your paper and explain why. Otherwise you are misrepresenting your model. No exceptions, no excuses.

                  That said, have you tried using a cube root transformation? That will narrow the range of the data and doesn't require excluding any observations because every real number has a real cube root. Excluding observations from data is usually a last resort and often has serious consequences for validity. You should exhaust all other possibilities first.

                  Hi,
                  Thanks for the reply.

                  Sure, I will report that, but I don't know how to explain this situation... Should I describe it as "the data caused econometric errors"?

                  I didn't try the cube root, because I haven't seen any paper in my field that used this method. But I will consider it if I need it.



                  • #10
                    No, the data didn't cause econometric errors. You just describe what happened: with the full range of data, the maximum likelihood estimation of the model failed to converge.

                    Even if cube root transformations are not widely used in your field, my intuition, assuming that you get convergent estimates using it, is that readers and reviewers will find it more acceptable than excluding observations with extreme values.



                    • #11
                      Originally posted by Clyde Schechter View Post
                      No, the data didn't cause econometric errors. You just describe what happened: with the full range of data, the maximum likelihood estimation of the model failed to converge.

                      Even if cube root transformations are not widely used in your field, my intuition, assuming that you get convergent estimates using it, is that readers and reviewers will find it more acceptable than excluding observations with extreme values.
                      Thank you very much. I will have a look.



                      • #12
                        Originally posted by Steve Samuels
                        Nick Cox wrote a Stata Journal Tip about the cube root transformation. He remarks that
                        Yes, I see that. But because I am still a PhD student, I think I have to follow what other papers have done.



                        • #13
                          Chen is referring to a post that I deleted by mistake. I repeat it below, with additions. As a past advisor and member of many Ph.D. committees, I disagree with your notion that you "have to" follow what others have done. Check with your advisor!

                          Nick Cox wrote a Stata Journal Tip about the cube root transformation. He remarks that
                          The cube root does an excellent job with a distinctly nonnormal distribution
                          You can refer to his article in your publication. I found it to be the best transformation for an analysis I did years ago (Cross et al., 1986). The only disadvantage of the cube root transformation is that to use the at() option in margins, you'll need to transform "nice" income values to their cube roots. See also Nick's expository help file about transformations (ssc install transint).
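
                          As a purely illustrative sketch of that margins point (inc, cbrt_inc, y, x2, and x3 are made-up names, not from this thread):

                          Code:
                          gen cbrt_inc = inc^(1/3)      // the income variable enters the model as its cube root
                          logit y cbrt_inc x2 x3
                          * to evaluate at "nice" incomes of 1000, 10000, and 100000, supply their cube roots
                          margins, at(cbrt_inc = (10 21.544 46.416))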


                          Reference:

                          Nicholas Cox (2011). Stata tip 96: Cube roots. The Stata Journal 11(1): 149–154.
                          http://www.stata-journal.com/sjpdf.h...iclenum=st0223

                          Nicholas Cross, H. Lambert, and S. Samuels (1986). Sperm binding activity of the zona pellucida of immature mouse oocytes. Cell Biol Int Rep 10(7): 545–554.
                          Last edited by Steve Samuels; 26 Apr 2016, 13:44.
                          Steve Samuels
                          Statistical Consulting
                          [email protected]

                          Stata 14.2



                          • #14
                            Originally posted by Steve Samuels View Post
                            Chen is referring to a post that I deleted by mistake. I repeat it below, with additions. As a past advisor and member of many Ph.D. committees, I disagree with your notion that you "have to" follow what others have done. Check with your advisor!


                            Hi,
                            Thank you very much. You are right; I think I am too conservative. I will think about that carefully.
