
  • logistic model - area under the curve, and c statistic

    Hello all,

    I have a query about the area under the curve (from the lroc command) and the C statistic (calculated using the user-written hl program, https://www.sealedenvelope.com/stata/hl/ ).

    I have created a logistic model (picture 1).

    Using the 'lroc' command I get an area under the curve of 0.6869 (picture 2).

    Using the user-written hl program ( https://www.sealedenvelope.com/stata/hl/ ) I get what I believe is a C statistic of 0.5697 (picture 3).

    From what I have been reading, I believe that for logistic models both these values should be equal?

    #1. Am I correct with this belief?

    #2. If so, could anyone please advise me why my two values are different?

    Thank you for your help.

    PICTURE 1
    [Image: 1. logistic model.png]

    PICTURE 2
    [Image: 2. roc.png]

    PICTURE 3
    [Image: 3. hl c statistic.png]

  • #2
    Yes, the area under the ROC curve and the C-statistic are the same thing. I am not familiar with the user-written program you are referring to, so I cannot comment on why it gives a different result. The official Stata -lroc- program has been around for a very long time, so it would be surprising if it had an uncorrected error, and I would be more inclined to believe its results. You might want to find the author of the user-written program and contact him/her about this.



    • #3
      Hello Murph,

      I suspect Clyde couldn't open the pictures. They are very small on my screen, and I needed to enlarge them to get a readable view.

      That is surely why he did not point out that the value 0.5697 is in fact the p-value of the Hosmer-Lemeshow test, not a C statistic.

      Indeed, a p-value > 0.05 in this case is good news in terms of the calibration of the model.
      Last edited by Marcos Almeida; 08 May 2017, 09:57.
      Best regards,

      Marcos



      • #4
        Thank you for your advice.

        Apologies for the sizing of the pictures.

        On the user-written program's website, they say that the program's output is the C statistic (P value 0.5697). [PICTURE 1]

        When I use "estat gof, group(10)", I get a P value of 0.3765. [PICTURE 2]

        I notice that the Hosmer-Lemeshow chi2 value is the same for both methods (8.61); however, the user-written program uses 10 degrees of freedom, whilst estat gof uses 8, which would explain the different P values.
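
        A quick check of that explanation, offered as a sketch rather than a definitive reproduction: Stata's chi2tail(df, x) function returns the upper-tail probability of a chi-squared distribution, so evaluating it at the rounded statistic 8.61 with each df should land close to the two P values reported above. Small differences are to be expected, because 8.61 is itself rounded.

        ```stata
        // Same Hosmer-Lemeshow statistic, two different degrees of freedom.
        // df = 8: as used by estat gof on the estimation sample
        display chi2tail(8, 8.61)
        // df = 10: as used by the hl program
        display chi2tail(10, 8.61)
        ```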

        Perhaps I have misunderstood the explanation provided by the author of hl. I now realise this is not how a c-statistic would be calculated. Apologies for my confusion. I will ask them for clarification.

        Thank you

        [PICTURE 1]
        [Image: 3. hl c statistic.png]



        [PICTURE 2]
        [Image: estat gof table.png]

        Last edited by Murph Ngo; 08 May 2017, 19:22.



        • #5
          Thank you for presenting larger images. I gather the discrepancy between the values is now clarified. If in doubt, I'd stick to the -estat gof- results (and its degrees of freedom).
          Last edited by Marcos Almeida; 09 May 2017, 07:10.
          Best regards,

          Marcos



          • #6
            Coming back to this with the benefit of the readable graphics, a quick summary.

            1. If you want the C-statistic, that is what -lroc- gives you.

            2. If you want the Hosmer-Lemeshow goodness-of-fit test, -estat gof- does that.

            3. If you are doing the Hosmer-Lemeshow test on the same data to which the logistic model was fit, the correct df is 8.

            4. If you are applying the test to a different, non-overlapping sample, then the correct df is 10. You can get that by specifying the -outsample- option in the -estat gof- command.
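
            The four points above can be sketched as follows. This is only an illustration under stated assumptions: the auto dataset, the predictors, and the train indicator (an arbitrary odd/even split) are hypothetical choices that are not from this thread, and the stated df values assume the predicted probabilities actually form 10 groups.

            ```stata
            sysuse auto, clear

            // Hypothetical split for illustration: odd rows fit the model, even rows are held out
            generate byte train = mod(_n, 2)

            logit foreign weight mpg if train, nolog

            // 1. C-statistic (area under the ROC curve) on the estimation sample
            lroc, nograph

            // 2-3. Hosmer-Lemeshow test on the estimation sample: with 10 groups, df = 10 - 2 = 8
            estat gof, group(10)

            // 4. Same test on the held-out rows: outsample sets df to the number of groups
            estat gof if !train, group(10) outsample
            ```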



            • #7
              Hi,
              I have a follow-up question regarding the C-statistic. I've been using the -lroc- command after -logit- to calculate it. However, -lroc- reports the area under the ROC curve only as a point estimate. I wonder if there is a command or method in STATA that can calculate both the point estimate and the 95% confidence interval of the C-statistic?
              I did not think the CIs were necessary until I saw that several articles report the C-statistic together with its 95% confidence interval:
              Moore, B.J., et al., Identifying Increased Risk of Readmission and In-hospital Mortality Using Hospital Administrative Data: The AHRQ Elixhauser Comorbidity Index. Med Care, 2017. 55(7): p. 698-705.
              Walraven, C.V., et al., A Modification of the Elixhauser Comorbidity Measures into a Point System for Hospital Death Using Administrative Data. Medical Care, 2009. 47(6): p. 626-633

              These articles used SAS (the %ROC macro from Gonen).
              Can STATA calculate C-statistics and its 95% confidence intervals? If yes how to do that?

              Any suggestions or comments are welcome. Thanks very much.

              Ginny





              • #8
                Originally posted by Ginny Han
                Can [Stata] calculate C-statistics and its 95% confidence intervals? If yes how to do that?
                Code:
                sysuse auto
                
                // One classification variable
                roctab foreign gear_ratio
                
                // Multiple classification variables in concert
                quietly logit foreign c.(gear_ratio displacement), nolog
                predict double xb, xb
                roctab foreign xb
                
                help roc
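
                As a hedged cross-check tying this back to -lroc-: the ROC area depends only on the ranking of the predictions, so -roctab- on the predicted probabilities should report the same area that -lroc- gives after the model, along with a standard error and confidence interval. The variable name phat and the scalar auc_lroc below are illustrative names only.

                ```stata
                sysuse auto, clear
                quietly logit foreign gear_ratio displacement

                // Point estimate from lroc, saved before r() is overwritten
                lroc, nograph
                scalar auc_lroc = r(area)

                // Same area from roctab, which also reports a standard error and 95% CI
                predict double phat, pr
                roctab foreign phat
                display "lroc: " auc_lroc "   roctab: " r(area)
                ```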



                • #9
                  Thank you very much, Mr. Coveney! Works perfectly.
