Predicted probabilities with categorical dependent variable

Mads Funder Berg

Join Date: Feb 2021

Posts: 15
#1

22 Feb 2021, 07:20

I am running an oprobit model where my dependent variable is a categorical variable that takes the values 1-5, where 1 = excellent health, 2 = very good, 3 = good, 4 = fair, and 5 = poor.

I am running the oprobit on 9 dummy variables, where 0 means the respondent does have the condition (for example, high blood pressure) and 1 means the respondent does NOT have the condition.

I run predict right after the oprobit. I then want to normalize the predictions to [0,1] so that the respondents with the best health are closest to one.

I am not sure how to go about this when my dependent variable can take 5 values. Normally, with a binary dependent variable, predict gives the probability that the variable equals 1 given the explanatory variables. But what do I do when the dependent variable is categorical?

This is what I do now

Code:

oprobit Self_reported_health i.High_blood_pressure1 i.Diabetes1 i.Cancer1 i.Lung_problems1 i.Heart_problems1 i.Stroke1 i.Psychological_problems1 i.Arthritis1 i.obese1
predict p_Self_reported_health
su p_Self_reported_health, meanonly
gen normal_p_Self_reported_health = (p_Self_reported_health - r(min)) / (r(max) - r(min))

Is this the correct way?
William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

22 Feb 2021, 07:55

See the section of the output of

Code:

help oprobit postestimation

that discusses using the predict command after oprobit. Briefly, it seems likely you should be creating 5 prediction variables using the outcome() option to the predict command; these will give the predicted probabilities of each of the 5 outcomes. Right now, you are only obtaining the predicted probability of excellent health.
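As a minimal sketch of what the outcome-specific predictions look like, the following uses simulated data rather than the poster's dataset. The variable names health and bp, the seed, and the cut points are assumptions made up for illustration only.

```stata
* Simulated 5-category outcome; health and bp are hypothetical names
clear
set seed 12345
set obs 1000
generate byte bp = runiform() < 0.5
generate double ystar = 0.8*bp + rnormal()
generate byte health = 1 + (ystar > -0.5) + (ystar > 0.3) + (ystar > 1.0) + (ystar > 1.8)

oprobit health i.bp

* One new variable per outcome, in the order of the outcome values
predict double p1 p2 p3 p4 p5, pr

* The same probability as p3, requested explicitly for the outcome coded 3
predict double p3_check, pr outcome(3)

* The five probabilities sum to 1 for every observation
generate double psum = p1 + p2 + p3 + p4 + p5
summarize psum p3 p3_check
```

Because the five probabilities always sum to 1, no single one of them is an overall health score by itself.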

Mads Funder Berg

Join Date: Feb 2021

Posts: 15
#3
22 Feb 2021, 08:06

Thank you for your answer, William. This was also my own thought. What do you think I should do when I want to normalize the prediction(s) to [0,1]? I need the normalized value to be closer to 1 when the respondent has better health.

I have tried this:

Code:

* Predicted probability of each of the five outcomes
predict p1 p2 p3 p4 p5, p

* Min-max rescale each probability to [0,1]
su p1, meanonly
gen normal_p1 = (p1 - r(min)) / (r(max) - r(min))
su p2, meanonly
gen normal_p2 = (p2 - r(min)) / (r(max) - r(min))
su p3, meanonly
gen normal_p3 = (p3 - r(min)) / (r(max) - r(min))
su p4, meanonly
gen normal_p4 = (p4 - r(min)) / (r(max) - r(min))
su p5, meanonly
gen normal_p5 = (p5 - r(min)) / (r(max) - r(min))

* Average the rescaled probabilities, then rescale the average
egen average = rowmean(normal_p1 normal_p2 normal_p3 normal_p4 normal_p5)
su average, meanonly
gen normal_average = (average - r(min)) / (r(max) - r(min))

But I am not sure this is the way to go about it.

William Lisowski

Join Date: Dec 2014

Posts: 10150
#4

22 Feb 2021, 16:27

What you propose treats each of the five probabilities symmetrically, so it is hard for me to see how it would yield a result that indicates the overall health of a given individual.

I don't see how to do what you want. My approach would be to predict the linear prediction xb and use it as your scale to assess overall health.

I think that to have larger values of xb correspond to better health, you are going to need to reverse the order of coding for health, assigning 1 to poor health, etc.
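A minimal sketch of that approach, again on simulated data: the names no_bp, no_diab, health_rev, and health_index are hypothetical, and the min-max rescaling is the original poster's normalization applied to xb, not something the model requires.

```stata
* Simulated data: 1 = does NOT have the condition, health 1 = excellent ... 5 = poor
clear
set seed 2021
set obs 1000
generate byte no_bp = runiform() < 0.6
generate byte no_diab = runiform() < 0.8
* Latent ill health: not having a condition lowers it
generate double ystar = -0.7*no_bp - 0.9*no_diab + rnormal()
generate byte health = 1 + (ystar > -1.8) + (ystar > -1.0) + (ystar > -0.3) + (ystar > 0.5)

* Reverse the coding so that 1 = poor ... 5 = excellent
generate byte health_rev = 6 - health

oprobit health_rev i.no_bp i.no_diab

* Linear prediction; larger xb now corresponds to better predicted health
predict double xb, xb

* Min-max rescale xb to [0,1]
summarize xb, meanonly
generate double health_index = (xb - r(min)) / (r(max) - r(min))
summarize health_index
```

For an ordered probit, reversing the outcome coding flips the sign of every coefficient, so negating xb from the model fitted to the original coding should give the same index.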