  • Logistic postestimation for survey data

    Hello,
    I want to fit a logistic regression to survey data, so I am using the svy prefix. How do I get the sensitivity and specificity for a logistic model fit with the svy prefix? Does estat class work? I can't find it in the manual among the commands available after the svy prefix.

    Thanks,
    James

  • #2
    As far as I know, -estat classification- is not supported after -svy:-, so I think you have to go back to the definitions of sensitivity and specificity. Here's an example:

    Code:
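    webuse nhanes2f, clear
    
    // display the survey design already stored with the data set
    svyset
    
    svy: logit heartatk age bpsystol tcresult
    
    // predicted probability of heartatk == 1
    predict prediction, pr
    // positive prediction if probability > 0.1; missing predictions are coded 0
    gen byte predicted = (prediction > 0.1) & !missing(prediction)
    
    // sensitivity: proportion with predicted == 1 among heartatk == 1
    svy, subpop(if heartatk == 1): proportion predicted
    
    // specificity: proportion with predicted == 0 among heartatk == 0
    svy, subpop(if heartatk == 0): proportion predicted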
    In the online nhanes2f data set, we fit a simple logistic model predicting heartatk from age, bpsystol, and tcresult. For illustration, we treat a predicted probability > 0.1 as a positive prediction. Sensitivity is then the probability of a positive prediction in the subpopulation with heartatk == 1, and specificity is the probability of a negative prediction in the subpopulation with heartatk == 0. If you run this code, you will see that, with this 0.1 cutoff, the sensitivity comes out as 0.296 and the specificity as 0.924 (to three decimal places).

    Note that -estat classification-, by default, uses a cutoff of predicted probability > 0.5. That's arbitrary and, in real life, typically not very useful. I didn't use that to illustrate the technique here because in this case it gives a sensitivity of 0 and a specificity of 1, which is not particularly insightful! (This happens because the highest predicted probability in the data set is 0.228).
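
    Before settling on a cutoff, it may help to look at the range of the predicted probabilities. The following is a minimal sketch that repeats the model from the example above and reports the largest fitted probability; the display text is only illustrative. An unweighted -summarize- is enough here, because only the minimum and maximum are of interest.

    ```stata
    webuse nhanes2f, clear
    svy: logit heartatk age bpsystol tcresult
    predict prediction, pr

    * Range of the fitted probabilities (min and max do not depend on weights)
    summarize prediction
    display "largest predicted probability: " %5.3f r(max)
    ```

    Any cutoff above the reported maximum classifies every observation as negative, which is why the 0.5 default is uninformative for this model.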

    • #3
      Thank you, Clyde. I will try that this week. I guess I will then examine an ROC curve to determine the best cutoff. Could you provide example code for producing an ROC curve using the svy prefix?

      • #4
        You just have to loop over threshold values of predicted probability, calculating the sensitivity and specificity for each one. To get a reasonably smooth curve, you should make the difference between successive thresholds pretty small, like maybe 0.02 or even 0.01. You can find the sensitivity and specificity in e(b) after each of the corresponding -svy: proportion- commands. Then calculate 1 - specificity as a new variable. Graph sensitivity vs that new variable. If you want the area under the ROC, you can get that with the -integ- command.

        Give it a try. If you get stuck, post back with the specific problem for more detailed assistance.
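
        For reference, the steps above could be sketched roughly as follows. This is one possible approach under stated assumptions, not the only way to do it: it reuses the nhanes2f model from #2, the names roc, rocdata, cutoff, sens, spec, and fpr are illustrative, and it swaps in -svy: mean- on the 0/1 indicator in place of -svy: proportion-. The mean of a 0/1 variable is the same point estimate as the proportion of 1s, and it leaves a single coefficient in e(b) even at thresholds where every prediction is negative.

        ```stata
        * Sketch only: assumes the nhanes2f example and model from #2
        webuse nhanes2f, clear
        svy: logit heartatk age bpsystol tcresult
        predict prediction, pr

        * Collect one row of results per threshold (names are illustrative)
        tempname roc
        tempfile rocdata
        postfile `roc' cutoff sens spec using `rocdata'

        forvalues i = 0/100 {
            local c = `i'/100
            capture drop predicted
            gen byte predicted = (prediction > `c') & !missing(prediction)

            * Mean of a 0/1 indicator = proportion positive in the subpopulation
            quietly svy, subpop(if heartatk == 1): mean predicted
            local sens = _b[predicted]
            quietly svy, subpop(if heartatk == 0): mean predicted
            local spec = 1 - _b[predicted]

            post `roc' (`c') (`sens') (`spec')
        }
        postclose `roc'

        * Switch to the collected results and draw the curve
        use `rocdata', clear
        gen fpr = 1 - spec
        sort fpr sens
        twoway line sens fpr, xtitle("1 - specificity") ytitle("Sensitivity")

        * Area under the curve by the trapezoidal rule
        integ sens fpr, trapezoid
        ```

        The step of 0.01 follows the suggestion above. Because the fitted probabilities in this example are all small, a finer grid over the lower part of the range would give a smoother curve.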

        • #5
          This was very helpful. I do have a question about the prediction, though. For reasons I don't understand, my logit prediction generated a number of missing values, which prevents me from getting appropriate sensitivity and specificity estimates. How can I handle missing values when I predict?
