
  • Predicted probabilities with categorical dependent variable

    I am running an oprobit model where my dependent variable is a categorical variable that takes the values 1-5, where 1 = excellent health, 2 = very good, 3 = good, 4 = fair and 5 = poor.

    I am running the oprobit on 9 dummy variables, where 0 means the respondent does have the condition (for example, high blood pressure) and 1 means the respondent does NOT have the condition.

    I run predict right after the oprobit. I then want to normalize the predictions to [0,1] so that the respondents with the best health are closest to 1.


    I am not sure how to go about this when my dependent variable can take 5 values. Normally, with a binary dependent variable, predict gives the probability that the variable equals 1 given the explanatory variables. But what do I do when the dependent variable is categorical?

    This is what I do now
    Code:
    oprobit Self_reported_health i.High_blood_pressure1 i.Diabetes1 i.Cancer1 i.Lung_problems1 i.Heart_problems1 i.Stroke1 i.Psychological_problems1 i.Arthritis1 i.obese1
    
    predict p_Self_reported_health
    
    su p_Self_reported_health, meanonly 
    gen normal_p_Self_reported_health = (p_Self_reported_health - r(min)) / (r(max) - r(min))
    Is this the correct way?


  • #2
    See the section of the output of
    Code:
    help oprobit postestimation
    that discusses using the predict command after oprobit. Briefly, it seems likely you should be creating 5 prediction variables using the outcome() option to the predict command; these will give the predicted probabilities of each of the 5 outcomes. Right now, you are only obtaining the predicted probability of excellent health.
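
    A minimal sketch of both forms, assuming the oprobit model from the first post has just been fit; p_excellent and p1-p5 are placeholder variable names:
    Code:
    * probability of a single outcome (here the category coded 1, excellent health)
    predict p_excellent, pr outcome(1)
    
    * probabilities of all five outcomes, one new variable per category
    predict p1 p2 p3 p4 p5, pr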



    • #3
      Thank you for your answer, William. This was also my own thought. What do you think I should do when I want to normalize the prediction(s) to [0,1]? I need the normalized value to be closer to 1 when the respondent has better health.

      I have tried this:
      Code:
      predict p1 p2 p3 p4 p5, p
      
      su p1, meanonly 
      
      gen normal_p1 = (p1 - r(min)) / (r(max) - r(min))
      
      su p2, meanonly 
      
      gen normal_p2 = (p2 - r(min)) / (r(max) - r(min))
      
      su p3, meanonly 
      
      gen normal_p3 = (p3 - r(min)) / (r(max) - r(min))
      
      su p4, meanonly 
      
      gen normal_p4 = (p4 - r(min)) / (r(max) - r(min))
      
      su p5, meanonly 
      
      gen normal_p5 = (p5 - r(min)) / (r(max) - r(min))
      
      
      
      egen average = rowmean(normal_p1 normal_p2 normal_p3 normal_p4 normal_p5)
      
      
      
      su average, meanonly
      
      gen normal_average = (average - r(min)) / (r(max) - r(min))
      but I am not sure this is the right way to go about it.



      • #4
        What you propose treats each of the five probabilities symmetrically, so it is hard for me to see how it would yield a result that indicates the overall health of a given individual.

        I don't see how to do what you want. My approach would be to predict the linear prediction xb and use it as your scale to assess overall health.

        I think that to have larger values of xb correspond to better health, you are going to need to reverse the order of coding for health, assigning 1 to poor health, etc.
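
        A minimal sketch of that idea, assuming the variables from the first post; health_rev, xb_health and health_index are placeholder names, and the last two lines simply reuse the min-max rescaling from the first post:
        Code:
        * reverse the coding so that 1 = poor ... 5 = excellent
        gen health_rev = 6 - Self_reported_health
        
        oprobit health_rev i.High_blood_pressure1 i.Diabetes1 i.Cancer1 i.Lung_problems1 i.Heart_problems1 i.Stroke1 i.Psychological_problems1 i.Arthritis1 i.obese1
        
        * linear prediction; larger values now correspond to better health
        predict xb_health, xb
        
        * rescale the linear prediction to [0,1] within the sample
        su xb_health, meanonly
        gen health_index = (xb_health - r(min)) / (r(max) - r(min))
        Note that a scale built this way is relative to the sample: 0 and 1 are the worst and best linear predictions observed, not absolute health levels.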

