
  • #16
    Originally posted by Joseph Coveney
    Along the lines of Richard's last suggestion, you could try the user-written (SSC) -firthlogit-. The penalization can help convergence. Be sure to center and re-scale your problematic predictor beforehand.
    Code:
    generate double new_predictor = (problematic_predictor - 304.1462) / 100
    Thanks, convergence is achieved when using -firthlogit-!



    • #17
      It has to do with numerical precision. Huge numbers scaled very differently from other variables can cause problems for the computer. For MLE problems and potential solutions, see

      https://www3.nd.edu/~rwilliam/xsoc73994/L02.pdf

      Incidentally, I am not a big fan of other potential solutions. If something doesn't converge after 16,000 iterations, I would be pretty surprised if it converged on iteration 17,384. I usually kill a job that is still running after a few hundred iterations.
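      A generic version of the centering and re-scaling suggested in #16, computing the mean from the data rather than hard-coding it (the variable name follows #16; dividing by 100 is an assumption about the variable's scale):

      Code:
      summarize problematic_predictor, meanonly
      generate double new_predictor = (problematic_predictor - r(mean)) / 100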
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 19.5 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://academicweb.nd.edu/~rwilliam/



      • #18
        Originally posted by Richard Williams
        It has to do with numerical precision. Huge numbers scaled very differently from other variables can cause problems for the computer. For MLE problems and potential solutions, see

        https://www3.nd.edu/~rwilliam/xsoc73994/L02.pdf

        Incidentally, I am not a big fan of other potential solutions. If something doesn't converge after 16,000 iterations, I would be pretty surprised if it converged on iteration 17,384. I usually kill a job that is still running after a few hundred iterations.
        This has been true in my experience also.
        Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

        When presenting code or results, please use code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



        • #19
          Amy Duma and Joseph Coveney, at the risk of being "a person who throws gloom over social enjoyment," this approach with -firthlogit- is not a solution to the problem at hand.

          If we accept that running -firthlogit- or -probit- instead of -logit- is a legitimate solution, I have an even easier solution which I guarantee will converge: just run a linear probability model, that is, a linear regression of the binary variable on the same regressors.
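          As a sketch, the linear probability model is just OLS of the binary outcome on the same regressors; with the lbw variables used below it would be (heteroskedasticity-robust standard errors are customary for an LPM):

          Code:
          regress low age lwt i.race smoke ptl ht ui, vce(robust)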

          I am not an expert on what exactly -firthlogit- does, but penalised likelihood is not the same as maximum likelihood. And -firthlogit- gives different results from -logit- on problems where convergence is achieved by both. Here:

          Code:
          . webuse lbw, clear
          (Hosmer & Lemeshow data)
          
          . logit low age lwt i.race smoke ptl ht ui
          
          Iteration 0:   log likelihood =   -117.336  
          Iteration 1:   log likelihood = -101.28644  
          Iteration 2:   log likelihood = -100.72617  
          Iteration 3:   log likelihood =   -100.724  
          Iteration 4:   log likelihood =   -100.724  
          
          Logistic regression                             Number of obs     =        189
                                                          LR chi2(8)        =      33.22
                                                          Prob > chi2       =     0.0001
          Log likelihood =   -100.724                     Pseudo R2         =     0.1416
          
          ------------------------------------------------------------------------------
                   low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |  -.0271003   .0364504    -0.74   0.457    -.0985418    .0443412
                   lwt |  -.0151508   .0069259    -2.19   0.029    -.0287253   -.0015763
                       |
                  race |
                black  |   1.262647   .5264101     2.40   0.016     .2309024    2.294392
                other  |   .8620792   .4391532     1.96   0.050     .0013548    1.722804
                       |
                 smoke |   .9233448   .4008266     2.30   0.021      .137739    1.708951
                   ptl |   .5418366    .346249     1.56   0.118     -.136799    1.220472
                    ht |   1.832518   .6916292     2.65   0.008     .4769494    3.188086
                    ui |   .7585135   .4593768     1.65   0.099    -.1418484    1.658875
                 _cons |   .4612239    1.20459     0.38   0.702    -1.899729    2.822176
          ------------------------------------------------------------------------------
          
          . firthlogit low age lwt i.race smoke ptl ht ui, nolog
          
                                                          Number of obs     =        189
                                                          Wald chi2(8)      =      24.76
          Penalized log likelihood = -85.625203           Prob > chi2       =     0.0017
          
          ------------------------------------------------------------------------------
                   low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   age |  -.0253433   .0353627    -0.72   0.474     -.094653    .0439664
                   lwt |  -.0138042   .0066473    -2.08   0.038    -.0268326   -.0007757
                       |
                  race |
                black  |   1.206882   .5059987     2.39   0.017     .2151434    2.198622
                other  |   .8185305   .4248464     1.93   0.054    -.0141531    1.651214
                       |
                 smoke |   .8782295   .3867022     2.27   0.023     .1203071    1.636152
                   ptl |   .5034914   .3318662     1.52   0.129    -.1469543    1.153937
                    ht |   1.711086   .6537438     2.62   0.009     .4297716      2.9924
                    ui |   .7361659    .443646     1.66   0.097    -.1333642    1.605696
                 _cons |   .3458818   1.165584     0.30   0.767    -1.938621    2.630385
          ------------------------------------------------------------------------------
          
          .
          Even in this example, which I did not screen to select for "large differences data" (it is the Stata manual example for -logit-), we see differences in the first digit after the decimal point.
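          For reference, what -firthlogit- maximizes is not the log likelihood ℓ(β) itself but Firth's (1993) penalized version, ℓ*(β) = ℓ(β) + (1/2) log |I(β)|, where I(β) is the Fisher information matrix. The penalty keeps the maximand finite even under complete or quasi-complete separation, which is why convergence is easier, and it is also why the estimates differ from those of -logit-.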



          • #20
            The firthlogit coefficients don't look all that different from the logit coefficients. Also, Paul Allison briefly discusses the penalized likelihood at

            https://statisticalhorizons.com/logi...or-rare-events

            Among other things, he says,

            Unlike exact logistic regression (another estimation method for small samples but one that can be very computationally intensive), penalized likelihood takes almost no additional computing time compared to conventional maximum likelihood. In fact, a case could be made for always using penalized likelihood rather than conventional maximum likelihood for logistic regression, regardless of the sample size. Does anyone have a counter-argument? If so, I’d like to hear it.
            I'd feel more comfortable with a regular logistic regression (see my suggestions above for other programs to try) but I could probably be ok with firthlogit. You could also compare the firthlogit estimates with the non-converged logit estimates to see if they seem reasonable.
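            A sketch of that comparison, using placeholder names y x1 x2 for the actual model: capping -logit- with iterate() makes it report its interim (non-converged) estimates so they can be stored.

            Code:
            logit y x1 x2, iterate(100)
            estimates store ml_interim
            firthlogit y x1 x2
            estimates store firth
            estimates table ml_interim firth, se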
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            StataNow Version: 19.5 MP (2 processor)

            EMAIL: [email protected]
            WWW: https://academicweb.nd.edu/~rwilliam/



            • #21
              Originally posted by Richard Williams
              You could also compare the firthlogit estimates with the non-converged logit estimates to see if they seem reasonable.
              It's a long shot, but another tack that Amy could look into is to use the converged estimates from firthlogit as starting values for logit. It would follow the same method as shown in the ancillary file SEMatch.do at firthlogit's SSC installation location—just omit the iterate(0) option shown there when feeding the coefficient vector to logit.
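              A minimal sketch of that approach, again with placeholder names y x1 x2 (SEMatch.do remains the authoritative version):

              Code:
              firthlogit y x1 x2
              matrix b = e(b)
              logit y x1 x2, from(b, copy)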

