
  • #16
    Originally posted by Joseph Coveney
    Along the lines of Richard's last suggestion, you could try the user-written (SSC) -firthlogit-. The penalization can help convergence. Be sure to center and re-scale your problematic predictor beforehand.
    Code:
    generate double new_predictor = (problematic_predictor - 304.1462) / 100
    Thanks, convergence is achieved when using -firthlogit-!



    • #17
      It has to do with numerical precision. Huge numbers scaled very differently from other variables can cause problems for the computer. For MLE problems and potential solutions, see

      https://www3.nd.edu/~rwilliam/xsoc73994/L02.pdf

      Incidentally, I am not a big fan of other potential solutions. If something doesn't converge after 16,000 iterations, I would be pretty surprised if it converged on iteration 17,384. I usually kill a job that is still running after a few hundred iterations.
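      A generic version of the centering and re-scaling suggested in #16, computing the mean from the data rather than hard-coding it (the variable name follows #16; dividing by 100 is an assumption about the variable's scale):

      Code:
      summarize problematic_predictor, meanonly
      generate double new_predictor = (problematic_predictor - r(mean)) / 100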
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 19.5 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://academicweb.nd.edu/~rwilliam/



      • #18
        Originally posted by Richard Williams
        It has to do with numerical precision. Huge numbers scaled very differently from other variables can cause problems for the computer. For MLE problems and potential solutions, see

        https://www3.nd.edu/~rwilliam/xsoc73994/L02.pdf

        Incidentally, I am not a big fan of other potential solutions. If something doesn't converge after 16,000 iterations, I would be pretty surprised if it converged on iteration 17,384. I usually kill a job that is still running after a few hundred iterations.
        This has been true in my experience also.
        Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

        When presenting code or results, please use code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



        • #19
          Amy Duma and Joseph Coveney, at the risk of being "a person who throws gloom over social enjoyment," this approach with -firthlogit- is not a solution to the problem at hand.

          If we accept that running -firthlogit- or -probit- instead of -logit- is a legitimate solution, I have an even easier solution which I guarantee will converge: just run a linear probability model, that is, a linear regression of the binary variable on the same regressors.
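          As a sketch, the linear probability model is just OLS of the binary outcome on the same regressors; with the lbw variables used below it would be (heteroskedasticity-robust standard errors are customary for an LPM):

          Code:
          regress low age lwt i.race smoke ptl ht ui, vce(robust)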

          I am not an expert on what exactly -firthlogit- does, but penalised likelihood is not the same as maximum likelihood. And -firthlogit- gives different results from -logit- on problems where convergence is achieved by both. Here:

          Code:
          . webuse lbw, clear
          (Hosmer & Lemeshow data)
          
          . logit low age lwt i.race smoke ptl ht ui
          
          Iteration 0:   log likelihood =   -117.336  
          Iteration 1:   log likelihood = -101.28644  
          Iteration 2:   log likelihood = -100.72617  
          Iteration 3:   log likelihood =   -100.724  
          Iteration 4:   log likelihood =   -100.724  
          
          Logistic regression                             Number of obs     =        189
                                                          LR chi2(8)        =      33.22
                                                          Prob > chi2       =     0.0001
          Log likelihood =   -100.724                     Pseudo R2         =     0.1416
          
          ------------------------------------------------------------------------------
                   low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |  -.0271003   .0364504    -0.74   0.457    -.0985418    .0443412
                   lwt |  -.0151508   .0069259    -2.19   0.029    -.0287253   -.0015763
                       |
                  race |
                black  |   1.262647   .5264101     2.40   0.016     .2309024    2.294392
                other  |   .8620792   .4391532     1.96   0.050     .0013548    1.722804
                       |
                 smoke |   .9233448   .4008266     2.30   0.021      .137739    1.708951
                   ptl |   .5418366    .346249     1.56   0.118     -.136799    1.220472
                    ht |   1.832518   .6916292     2.65   0.008     .4769494    3.188086
                    ui |   .7585135   .4593768     1.65   0.099    -.1418484    1.658875
                 _cons |   .4612239    1.20459     0.38   0.702    -1.899729    2.822176
          ------------------------------------------------------------------------------
          
          . firthlogit low age lwt i.race smoke ptl ht ui, nolog
          
                                                          Number of obs     =        189
                                                          Wald chi2(8)      =      24.76
          Penalized log likelihood = -85.625203           Prob > chi2       =     0.0017
          
          ------------------------------------------------------------------------------
                   low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |  -.0253433   .0353627    -0.72   0.474     -.094653    .0439664
                   lwt |  -.0138042   .0066473    -2.08   0.038    -.0268326   -.0007757
                       |
                  race |
                black  |   1.206882   .5059987     2.39   0.017     .2151434    2.198622
                other  |   .8185305   .4248464     1.93   0.054    -.0141531    1.651214
                       |
                 smoke |   .8782295   .3867022     2.27   0.023     .1203071    1.636152
                   ptl |   .5034914   .3318662     1.52   0.129    -.1469543    1.153937
                    ht |   1.711086   .6537438     2.62   0.009     .4297716      2.9924
                    ui |   .7361659    .443646     1.66   0.097    -.1333642    1.605696
                 _cons |   .3458818   1.165584     0.30   0.767    -1.938621    2.630385
          ------------------------------------------------------------------------------
          
          .
          Even in this example, which I did not screen to select for "large differences data" (it is the Stata manual example for -logit-), we see differences in the first digit after the decimal point.
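          For reference, what -firthlogit- maximizes is not the log likelihood ℓ(β) itself but Firth's (1993) penalized version, ℓ*(β) = ℓ(β) + (1/2) log |I(β)|, where I(β) is the Fisher information matrix. The penalty keeps the maximand finite even under complete or quasi-complete separation, which is why convergence is easier, and it is also why the estimates differ from those of -logit-.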



          • #20
            The firthlogit coefficients don't look all that different from the logit coefficients. Also, Paul Allison briefly discusses the penalized likelihood at

            https://statisticalhorizons.com/logi...or-rare-events

            Among other things, he says,

            Unlike exact logistic regression (another estimation method for small samples but one that can be very computationally intensive), penalized likelihood takes almost no additional computing time compared to conventional maximum likelihood. In fact, a case could be made for always using penalized likelihood rather than conventional maximum likelihood for logistic regression, regardless of the sample size. Does anyone have a counter-argument? If so, I’d like to hear it.
            I'd feel more comfortable with a regular logistic regression (see my suggestions above for other programs to try) but I could probably be ok with firthlogit. You could also compare the firthlogit estimates with the non-converged logit estimates to see if they seem reasonable.
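            A sketch of that comparison, using placeholder names y x1 x2 for the actual model: capping -logit- with iterate() makes it report its interim (non-converged) estimates so they can be stored.

            Code:
            logit y x1 x2, iterate(100)
            estimates store ml_interim
            firthlogit y x1 x2
            estimates store firth
            estimates table ml_interim firth, se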
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            StataNow Version: 19.5 MP (2 processor)

            EMAIL: [email protected]
            WWW: https://academicweb.nd.edu/~rwilliam/



            • #21
              Originally posted by Richard Williams
              You could also compare the firthlogit estimates with the non-converged logit estimates to see if they seem reasonable.
              It's a long shot, but another tack that Amy could look into is to use the converged estimates from firthlogit as starting values for logit. It would follow the same method as shown in the ancillary file SEMatch.do at firthlogit's SSC installation location—just omit the iterate(0) option shown there when feeding the coefficient vector to logit.
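              A minimal sketch of that approach, again with placeholder names y x1 x2 (SEMatch.do remains the authoritative version):

              Code:
              firthlogit y x1 x2
              matrix b = e(b)
              logit y x1 x2, from(b, copy)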

