Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • firthlogit questions on generating ROC and pseudo R2

    While I have a fair understanding of basic Stata and logit regressions, I have a quasi-separation issue. After researching the Internet including Statlist, I find some help information but need help in understanding how to operationalize the information.

    First question:

    In a response to a 14 March 2017 post on "Receiver operating curve after firthlogit command" stating "After I perform penalized maximum likelihood logistic regression with firthlogit, what is the easiest way to get the ROC for the model?
    (lroc command will not work)"

    Joseph Coveney responded:

    Code:
    firthlogit response c.predictor
    predict double xb, xb
    roctab response xb


    When using lroc, it is place immediate after the regression code on a separate line. For example:

    logit x y1 y2 y3
    lroc

    If someone could help me understand how to operationalize Joseph Coveney's code, it would be appreciated.

    Second question:

    For pseudo R2, I understand Stata uses McFadden's R2. For logit, I can recalculate the Stata result which is 1 - [log likelihood(full model) / log likelihood(intercept only model)]. Firthlogit generates a log likelihood(full model) but does not generate a log likelihood(intercept only model). Possible approach are use a regular logit log likelihood(intercept only model) or to use a firthlogit partial model with dependent and test variable. My feel is the regular logit log likelihood(intercept only model) is the most appropriate approach. Would appreciate any guidance offered.
    Last edited by Bob Sparger; 10 May 2023, 05:26.

  • #2
    Originally posted by Bob Sparger View Post
    I . . . need help in understanding how to operationalize the information.

    First question:

    . . . If someone could help me understand how to operationalize [the] code, it would be appreciated.
    I'm not sure what you mean here by "operationalize". Do you want to wrap the two postestimation commands into one? Do you want somehow to cajole -lroc- to work after -firthlogit-? (Neither of these possibilities seems worth the effort.)

    Second question:

    . . . Firthlogit generates a log likelihood(full model) but does not generate a log likelihood(intercept only model).
    That's not quite correct: -firthlogit- fits the model using penalized log-likelihood as the maximization criterion. I'm not trying to be pedantic; I suspect that it actually matters for your proposal:

    Possible approach are use a regular logit log likelihood(intercept only model) or to use a firthlogit partial model with dependent and test variable. My feel is the regular logit log likelihood(intercept only model) is the most appropriate approach.
    Doing that, you'd be comparing a penalized log-likelihood to an unpenalized log-likelihood, and I'm not sure how kosher that would be.

    But I agree that "a firthlogit partial model with dependent and test variable" wouldn't be an appropriate comparison model, and in lieu of that, perhaps you could consider something along the intercept-only approach illustrated below. I can't vouch for its properties as a pseudo-R2, but at least it'd be apples-to-apples.
    Code:
    firthlogit response c.predictor, nolog
    tempname full_ll
    scalar define `full_ll' = e(ll)
    
    constraint define 1 _b[predictor] = 0
    firthlogit response c.predictor, constraints(1) nolog
    
    display in smcl as text "{it:really} pseudo R² = " ///
        as result %06.4f 1 - `full_ll' / e(ll)
    What you're proposing I believe would be given by the following, which is illustrated in the SEMatch.do file found among the ancillary files in SSC's entry for -firthlogit-.
    Code:
    firthlogit response c.predictor, nolog
    
    tempname B
    matrix define `B' = e(b)
    
    logit response c.predictor, asis iterate(0) from(`B', copy) nolog
    // Read the pseudo R² directly in the output
    I'm not sure how much they would differ in the presence of complete separation or quasiseparation, but at least in the example below they don't seem to differ from each other by a lot under benign conditions.

    .ÿ
    .ÿversionÿ18.0

    .ÿ
    .ÿclearÿ*

    .ÿ
    .ÿ//ÿseedem
    .ÿsetÿseedÿ758585087

    .ÿ
    .ÿquietlyÿsetÿobsÿ200

    .ÿgenerateÿdoubleÿpredictorÿ=ÿruniform()

    .ÿgenerateÿbyteÿresponseÿ=ÿrbinomial(1,ÿpredictor)

    .ÿ
    .ÿ*
    .ÿ*ÿApples-to-apples
    .ÿ*
    .ÿfirthlogitÿresponseÿc.predictor,ÿnolog

    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿNumberÿofÿobsÿ=ÿÿÿÿ200
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿWaldÿchi2(1)ÿÿ=ÿÿ42.65
    Penalizedÿlogÿlikelihoodÿ=ÿ-106.49309ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿProbÿ>ÿchi2ÿÿÿ=ÿ0.0000

    ------------------------------------------------------------------------------
    ÿÿÿÿresponseÿ|ÿCoefficientÿÿStd.ÿerr.ÿÿÿÿÿÿzÿÿÿÿP>|z|ÿÿÿÿÿ[95%ÿconf.ÿinterval]
    -------------+----------------------------------------------------------------
    ÿÿÿpredictorÿ|ÿÿÿ4.633721ÿÿÿ.7095129ÿÿÿÿÿ6.53ÿÿÿ0.000ÿÿÿÿÿ3.243102ÿÿÿÿ6.024341
    ÿÿÿÿÿÿÿ_consÿ|ÿÿ-2.269335ÿÿÿ.3882059ÿÿÿÿ-5.85ÿÿÿ0.000ÿÿÿÿ-3.030204ÿÿÿ-1.508465
    ------------------------------------------------------------------------------

    .ÿestimatesÿstoreÿFull

    .ÿtempnameÿfull_ll

    .ÿscalarÿdefineÿ`full_ll'ÿ=ÿe(ll)

    .ÿ
    .ÿconstraintÿdefineÿ1ÿ_b[predictor]ÿ=ÿ0

    .ÿfirthlogitÿresponseÿc.predictor,ÿconstraints(1)ÿnolog

    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿNumberÿofÿobsÿ=ÿ200
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿWaldÿchi2(0)ÿÿ=ÿÿÿ.
    Penalizedÿlogÿlikelihoodÿ=ÿ-135.99991ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿProbÿ>ÿchi2ÿÿÿ=ÿÿÿ.

    ÿ(ÿ1)ÿÿ[xb]predictorÿ=ÿ0
    ------------------------------------------------------------------------------
    ÿÿÿÿresponseÿ|ÿCoefficientÿÿStd.ÿerr.ÿÿÿÿÿÿzÿÿÿÿP>|z|ÿÿÿÿÿ[95%ÿconf.ÿinterval]
    -------------+----------------------------------------------------------------
    ÿÿÿpredictorÿ|ÿÿÿÿÿÿÿÿÿÿ0ÿÿ(omitted)
    ÿÿÿÿÿÿÿ_consÿ|ÿÿÿ.0198044ÿÿÿ.1407264ÿÿÿÿÿ0.14ÿÿÿ0.888ÿÿÿÿ-.2560143ÿÿÿÿ.2956231
    ------------------------------------------------------------------------------

    .ÿdisplayÿinÿsmclÿasÿtextÿ"{it:really}ÿpseudoÿÿ=ÿ"ÿ///
    >ÿÿÿÿÿÿÿÿÿasÿresultÿ%06.4fÿ1ÿ-ÿ`full_ll'ÿ/ÿe(ll)
    reallyÿpseudoÿÿ=ÿ0.2170

    .ÿ
    .ÿ*
    .ÿ*ÿWhatÿyou'reÿproposingÿ(penalizedÿfullÿmodelÿversusÿunpenalizedÿnullÿmodel)
    .ÿ*
    .ÿestimatesÿrestoreÿFull
    (resultsÿFullÿareÿactiveÿnow)

    .ÿ
    .ÿtempnameÿB

    .ÿmatrixÿdefineÿ`B'ÿ=ÿe(b)

    .ÿ
    .ÿlogitÿresponseÿc.predictor,ÿasisÿiterate(0)ÿfrom(`B',ÿcopy)ÿnolog
    convergenceÿnotÿachieved

    LogisticÿregressionÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿNumberÿofÿobsÿ=ÿÿÿÿ200
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿLRÿchi2(1)ÿÿÿÿ=ÿÿ59.97
    ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿProbÿ>ÿchi2ÿÿÿ=ÿ0.0000
    Logÿlikelihoodÿ=ÿ-108.63467ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿPseudoÿR2ÿÿÿÿÿ=ÿ0.2163

    ------------------------------------------------------------------------------
    ÿÿÿÿresponseÿ|ÿCoefficientÿÿStd.ÿerr.ÿÿÿÿÿÿzÿÿÿÿP>|z|ÿÿÿÿÿ[95%ÿconf.ÿinterval]
    -------------+----------------------------------------------------------------
    ÿÿÿpredictorÿ|ÿÿÿ4.633721ÿÿÿ.7114305ÿÿÿÿÿ6.51ÿÿÿ0.000ÿÿÿÿÿ3.239343ÿÿÿÿÿÿ6.0281
    ÿÿÿÿÿÿÿ_consÿ|ÿÿ-2.269335ÿÿÿ.3892962ÿÿÿÿ-5.83ÿÿÿ0.000ÿÿÿÿ-3.032341ÿÿÿ-1.506328
    ------------------------------------------------------------------------------
    Warning:ÿConvergenceÿnotÿachieved.

    .ÿ//ÿReadÿtheÿpseudoÿÿdirectlyÿinÿtheÿoutput
    .ÿ
    .ÿexit

    endÿofÿdo-file


    .


    To be honest, I'm not very up on the sundry pseudo-R2 methods for logistic regression models as I have never been able to make pracitcal use of them. After fitting a logistic regression model, pseudo-R2 to me is the least-interesting and first-ignored part of the output.

    Comment


    • #3
      @Jospeh Coveney, thank you. Very helpful.

      I stand correct on Question 2. I was thinking in logit terms where an intercept only model would be logit depvar, while as you suggest for firthlogit would be firthlogit depvar dummy (in your example predictor = 0) which is omitted due to collinearity. Your suggested solution works.

      Comment

      Working...
      X