
  • 100% sensitivity in estat classification, no observations classified as negative, and only 16 of 1771 cases showing on my ROC printout and scatterplot

    Good afternoon,

    When I run my logistic regression, everything comes out OK with no strange results. However, when I run the estat classification command, all my observations are classified as positive. I'm very new to Stata, and our class encourages us to copy and paste from the tutor's .do file and insert our own variables. I've tried researching, but I'm unsure whether this is the output I'm supposed to get or whether there's an error in my code, as my tutor's printout populates all four categories. I'm using the following code for the dummy variables:

    *Q14 Social Media usage*
    gen SMuse =1 if Q14 == 4 | Q14== 5 | Q14 == 2 | Q14 == 3
    replace SMuse = 0 if Q14 == 1
    label define SMuse 1 "Use" 0 "Don't use"
    label values SMuse SMuse

    *Q22a Transfer Powers to or from the EU*
    gen NationalPower = 1 if Q22a == 2
    replace NationalPower = 0 if Q22a == 1 | Q22a == 97
    label define NationalPower 1 "Agree" 0 "Disagree"
    label values NationalPower NationalPower

    *Q23a EU Membership is Good or Bad*
    gen MembershipEU = 1 if Q23a == 2
    replace MembershipEU = 0 if Q23a == 1 | Q23a == 97
    label define MembershipEU 1 "Agree" 0 "Disagree"
    label values MembershipEU MembershipEU

    *Q28a Immigrants should adopt UK Values*

    gen UKValues = 1 if Q28a == 1
    replace UKValues = 0 if Q28a == 2 | Q28a == 97
    label define UKValues 1 "Agree" 0 "Disagree"
    label values UKValues UKValues

    *Q30a Immigrants are good for the UK Economy*
    gen ImmigrationEco = 1 if Q30a == 2
    replace ImmigrationEco = 0 if Q30a == 1 | Q30a == 97
    label define ImmigrationEco 1 "Agree" 0 "Disagree"
    label values ImmigrationEco ImmigrationEco

    logit NationalPower SMuse UKValues MembershipEU ImmigrationEco
    logit NationalPower SMuse UKValues MembershipEU ImmigrationEco, or

    estat classification
    lroc

    *** Generate predicted probabilities
    predict p

    *** Predict standardized residuals
    predict stdres, rstand

    *** Scatterplot of standardized residuals and predicted probabilities
    scatter stdres p, mlabel(NationalPower) yline(0)

    end

    The ROC output shows only 16 cases even though it records 1771 as the number of observations. Additionally, my Agree/Disagree observations overlap on the scatterplot. Below are the printouts:

    [Image: Screenshot 2020-12-06 at 13.10.14.png]

    [Image: Screenshot 2020-12-06 at 13.11.06.png]

    [Image: Screenshot 2020-12-06 at 13.12.08.png]


    Additionally, when I start running diagnostics on the model, all my cases seem to overlap. If you require more information, I'll send it through ASAP.

    Thank you for your time. - Tom

  • #2
    -estat classification-, by default, counts an observation as a predicted positive if its predicted probability is at least 0.5. That's an arbitrary threshold, and as your graph shows, it's completely inappropriate for your data, where the predicted Pr(NationalPower = 1) is always greater than 0.5: every observation is classified as positive, which is why sensitivity is 100% and nothing lands in the negative category. So you need to override that default by picking an appropriate cutoff to define a positive prediction. -estat classification- has a -cutoff()- option that permits you to do that.
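
    As a rough sketch of how the cutoff might be overridden, assuming the model from post #1: -lsens- plots sensitivity and specificity against every candidate cutoff, which can help in choosing one. Using the observed share of Agree responses as the cutoff is only an illustrative assumption here, not a recommendation; any value that suits the purpose of the classification could go inside -cutoff()-.

    ```stata
    * Refit the model from post #1 so the estimation results are in memory
    logit NationalPower SMuse UKValues MembershipEU ImmigrationEco

    * Graph sensitivity and specificity against the probability cutoff
    lsens

    * Illustrative cutoff (an assumption): the proportion of NationalPower == 1
    * in the estimation sample, stored by -summarize- in r(mean)
    summarize NationalPower if e(sample), meanonly
    estat classification, cutoff(`r(mean)')
    ```

    With a cutoff inside the range of the predicted probabilities, the classification table should show observations in both the positive and the negative columns.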

    Concerning your scatterplot, it seems that for every combination of predictors that yields any given predicted probability of NationalPower, there are both people who Agree and who Disagree on Q22a, so you get the picture you see. That's not really very surprising, since if any combination of predictors always led to disagree or always led to agree as the outcome, then its predicted probability would be equal to, or very close to, 0 or 1. When the predicted probability is not at the extreme ends of the 0-1 interval, you will usually see both Agree and Disagree outcomes at that level of predicted probability (except in very small samples or poorly fitting models).
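
    One way to check this directly is sketched below, assuming -predict p- and -predict stdres, rstand- have already been run as in post #1. The variable name pattern is hypothetical. Because the model has four 0/1 predictors, there can be at most 2^4 = 16 distinct covariate patterns, and so at most 16 distinct predicted probabilities. That may well be why the ROC output and the scatterplot appear to show only 16 cases: the 1771 observations are stacked on top of one another.

    ```stata
    * Give each distinct combination of the four predictors its own id
    * (pattern is a hypothetical variable name)
    egen pattern = group(SMuse UKValues MembershipEU ImmigrationEco)

    * Counts of Disagree and Agree within each covariate pattern
    tabulate pattern NationalPower if e(sample)

    * Predicted probability (constant within a pattern) and group size
    tabstat p if e(sample), by(pattern) statistics(mean n)

    * Jitter the markers so that stacked observations become visible
    scatter stdres p, jitter(5) yline(0)
    ```

    If the cross-tabulation shows both Agree and Disagree in most rows, the overlap on the plot reflects the data rather than a coding error.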
