Interpreting results of marginal effects for ordered response using margins command

Jim Walker

Join Date: May 2016

Posts: 8
#1

16 May 2016, 19:36

Hi all, just a quick question.

My dependent variable has three ordered responses (0=healthiest, 1, 2=unhealthiest) and my predictor variables are also ordered categories (except gender). I ran oprobit and then used the margins command as follows:

margins, dydx(*) predict (outcome(0))

Some results:
For outcome==0 (healthiest)
Female: .018
Rich: .067

For outcome==2 (unhealthiest)
Female: -.007
Rich: -.026

Based on the Stata guide, which gave examples for logistic regression, I'm going to guess this can be interpreted as follows:

Overall, being female instead of male, and being rich instead of poor (baseline) means you are 1.8% and 6.7% more likely to belong to the "healthiest" category

Similarly, being female and being rich means you are .7% and 2.6% less likely to belong to the "unhealthiest" category.

Question 1: Are my interpretations correct?

Question 2: Would you advise using "atmeans" in this case? (seems a bit odd since I have categorical predictors)

Question 3: How do I interpret results for outcome==1? For example, Rich = -.04. Would it be just like the others (being rich means you are 4% less likely to belong to the moderate category)?

Thanks very much in advance!

Last edited by Jim Walker; 16 May 2016, 19:42.
Clyde Schechter

Join Date: Apr 2014

Posts: 30065
#2

16 May 2016, 20:50

Overall, being female instead of male, and being rich instead of poor (baseline) means you are 1.8% and 6.7% more likely to belong to the "healthiest" category

Similarly, being female and being rich means you are .7% and 2.6% less likely to belong to the "unhealthiest" category.

I think you have the right idea in mind, but your expression of it is off the mark. Being female instead of male additively increases your probability of being in the healthiest outcome classification by 1.8 percentage points. Similarly for all of the others. I imagine that is what you meant. But what you said is actually rather different.

Suppose that in the base category, male, the probability of being in the healthiest category is 35%, just as an example. Then being female is associated with a probability of 35% + 1.8 percentage points = 36.8%. However, what you said would imply that females have a probability of 35%*1.018 = 35.63%. The -margins- output gives you additive percentage points, not multiplicative percents.
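
To make the difference concrete, the two readings can be computed side by side. This is only a sketch: the 35% baseline is the made-up figure from the example above, and the local macro names base and me are purely illustrative.

```stata
* Hypothetical baseline: Pr(healthiest) for males, taken from the example above
local base = 0.35
* Marginal effect of female on Pr(outcome 0), as reported by -margins-
local me = 0.018

* Additive reading (correct): add 1.8 percentage points to the baseline
display "additive:       " %6.4f (`base' + `me')

* Multiplicative reading (incorrect): raise the baseline by 1.8 percent
display "multiplicative: " %6.4f (`base' * (1 + `me'))
```

The first line should display 0.3680 and the second 0.3563, matching the 36.8% and 35.63% worked out above.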

Finally, since you show us neither the code you used nor the full output, my interpretations here are based on some assumptions about what you did. In particular, these interpretations are probably wrong if your model contains any interaction terms or quadratic terms involving any of the variables you have mentioned.

For the future, when asking for interpretation of output, it is always better to show the actual code you ran and the complete output. The details are often quite important.
Jim Walker

Join Date: May 2016

Posts: 8
#3

16 May 2016, 21:22

There are no interaction or quadratic terms. Thank you very much for your correction, that is very helpful!

Here's my code:
oprobit status i.education i.wealth i.sex
margins, dydx(*) predict (outcome(0))
margins, dydx(*) predict (outcome(1))
margins, dydx(*) predict (outcome(2))

Response variable:
Status (0=healthiest...2=unhealthiest)

Predictor variables:
education (0=none...4=higher)
wealth (1=poorest... 5=richest)
sex (0=male, 1=female)

Should I use atmeans in the margins command?

Is there additional code I should run if I want to properly examine the predictors' relationship with the response?

Last edited by Jim Walker; 16 May 2016, 21:25.
Clyde Schechter

Join Date: Apr 2014

Posts: 30065
#4

16 May 2016, 21:53

With discrete predictors, taking the mean is not a very meaningful option, and using the -atmeans- option in your margins command would give you results that do not apply to any observation in your data set, and generally not to any possible combination of values that could ever be observed in reality. So I generally tend to reserve the -atmeans- option for use with continuous variables.

I generally like to also get the predicted values at all combinations of the variables. So you might want to run:

Code:

margins education wealth sex

Note: If you are using the current version of Stata (14.1), following -oprobit- the default statistics for -margins- are the predicted probabilities for all levels of the outcome variable. This was not always the case in earlier versions of Stata, so to get them all you might have to call -margins- repeatedly with -predict(outcome(0))-, -predict(outcome(1))-, etc. as you showed above. But those specifications are not needed in the current version.
Jim Walker

Join Date: May 2016

Posts: 8
#5

17 May 2016, 12:48

Yes, I'm on Stata 13.1. When I run exactly the code you suggest, it shows me the output for outcome(0). I get an error when I try to specify the outcome explicitly:

margins education wealth sex predict (outcome(0))

or any other outcome. It says "variable predict not found".

Is there something I'm missing?
Clyde Schechter

Join Date: Apr 2014

Posts: 30065
#6

17 May 2016, 12:54

Yes, you are missing a comma between sex and predict.

Also, remove the blank space between predict and (outcome(0))--it may or may not cause problems, but it often does.
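
With both fixes applied, the calls would look something like the following. This is a sketch that assumes the -oprobit- model from #3 is still the active estimation result; on Stata 13 each outcome needs its own call.

```stata
* Assumes -oprobit status i.education i.wealth i.sex- from #3 has just been run
* Note the comma before the options and no space between predict and (
margins education wealth sex, predict(outcome(0))
margins education wealth sex, predict(outcome(1))
margins education wealth sex, predict(outcome(2))
```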
Jim Walker

Join Date: May 2016

Posts: 8
#7

17 May 2016, 13:10

Ah lovely! That worked indeed (the extra space wasn't a problem for me, just the comma).

These probabilities all make sense, for example:

Outcome==0 (healthy)

      wealth |     Margin
-------------+-----------
     poorest |   .9001082
      poorer |   .9045627
      middle |   .9275537
      richer |    .955038
     richest |    .966814

In other words, the probability is higher that you belong to the healthy category as you get richer.

But what exactly does the .966 mean? Like, what does it assume the other predictor variables are when it makes this predicted probability?
Clyde Schechter

Join Date: Apr 2014

Posts: 30065
#8

17 May 2016, 13:26

It assumes that the other predictor variables are exactly what they are in the data set. So, looking at the output you show in #7, Stata calculates these by going through each observation and calculating the predicted probability for each using the actual values of all of the model variables, except it substitutes poorest for the wealth variable. It then averages those predicted probabilities and comes up with 0.9001082. Then it goes back and does it again, this time with the wealth variable set to poorer, but all other variables kept at their actual values. This time the average predicted probability is .9045627. Rinse and repeat.

To put it succinctly, these are average predicted probabilities at each level of wealth adjusted to the observed distribution of the other model variables.
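
The procedure described above can be reproduced by hand for one level of wealth, which may help convince you of what -margins- is doing. This is a sketch only: it assumes the -oprobit- model from #3 is the active estimation result, that wealth is coded 1 (poorest) through 5 (richest) as described there, and that no weights were used. The variable name p0_poorest is hypothetical.

```stata
* What -margins- reports for each level of wealth, outcome 0
margins wealth, predict(outcome(0))

* The same number for wealth == 1 (poorest), computed by hand
preserve
* Set everyone to poorest, leaving education and sex at their observed values
replace wealth = 1
* Predicted Pr(status == 0) for each observation under that substitution
predict double p0_poorest, pr outcome(0)
* Average over the estimation sample; the mean should match the poorest margin
summarize p0_poorest if e(sample)
restore
```

Changing the replace line to wealth = 2, 3, and so on should reproduce the remaining rows.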

Richard Williams

Join Date: Apr 2014
Posts: 4983

17 May 2016, 13:26

Given that you are condemned to using the hopelessly antiquated Stata 13, consider using the mtable command that is part of Long & Freese's spost13 suite (findit spost13_ado). It is far easier than having to give separate commands for each outcome. For example,

Code:

. webuse nhanes2f, clear

. ologit health weight age, nolog

Ordered logistic regression                     Number of obs     =     10,335
                                                LR chi2(2)        =    1448.89
                                                Prob > chi2       =     0.0000
Log likelihood = -15039.951                     Pseudo R2         =     0.0460

------------------------------------------------------------------------------
      health |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |  -.0034246   .0011553    -2.96   0.003    -.0056889   -.0011603
         age |  -.0399811   .0010854   -36.83   0.000    -.0421085   -.0378537
-------------+----------------------------------------------------------------
       /cut1 |  -4.914335    .108239                     -5.126479    -4.70219
       /cut2 |  -3.449735   .1021272                     -3.649901   -3.249569
       /cut3 |  -2.049077    .098365                     -2.241869   -1.856285
       /cut4 |  -.8142637   .0968314                      -1.00405   -.6244777
------------------------------------------------------------------------------

. mtable, dydx(*) dec(5)

Expression: Marginal effect of Pr(health), predict(outcome())

           |     poor      fair   average      good  excellent
 ----------+--------------------------------------------------
    weight |  0.00022   0.00035   0.00019  -0.00020   -0.00056
       age |  0.00257   0.00411   0.00223  -0.00238   -0.00653

Specified values where .n indicates no values specified with at()

           |  No at()
 ----------+---------
   Current |       .n

For more examples, see the appendices of http://www3.nd.edu/~rwilliam/xsoc73994/Ologit01.pdf.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam


Jim Walker

Join Date: May 2016

Posts: 8
#10

17 May 2016, 18:00

Clyde and Richard, thanks so much for your help. I'm sorry to bother again, but I was doing further reading on the web about calculating marginal effects, specifically this page from SAS:

http://support.sas.com/rnd/app/examples/ets/margeff/

Almost halfway down the page is a paragraph starting with "There might be cases where some regressors are dummy variables." In post #3 I noted that none of my predictor variables are continuous; they're mostly ordinal. Is it really correct for me to use the margins, dydx(*) predict (outcome(0)) command? They say a derivative calculation may not be meaningful and suggest doing this instead:

As in, they suggest calculating the change in predicted probability that occurs when a predictor variable changes by one level, so for example when my observation is female instead of male, or "poorer" instead of "poorest". What do you guys think about this?

Last edited by Jim Walker; 17 May 2016, 18:15.
Clyde Schechter

Join Date: Apr 2014

Posts: 30065
#11

17 May 2016, 19:28

When you specify discrete variables (using the i. prefix in your regression model), -margins- calculates the marginal effect as the difference between the expected probability when x = 1 and the expected probability when x = 0. For a factor variable with more than two levels, each level is compared with the base level in the same way. While the -margins- command still uses the notation dydx, no derivative calculations are made for this. Derivative calculations are done by -margins- only when calculating the marginal effects of continuous variables.
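
One way to see this for yourself is to compare the two kinds of output directly. The sketch below assumes the -oprobit- model from #3 is the active estimation result and that sex was entered as i.sex.

```stata
* Average predicted Pr(status == 0) with sex set to male, then to female
margins sex, predict(outcome(0))

* Discrete change for 1.sex: should equal the female margin minus the male margin
margins, dydx(sex) predict(outcome(0))
```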
Jim Walker

Join Date: May 2016

Posts: 8
#12

17 May 2016, 20:58

Ah, so it automatically knows what to do. Fantastic!

Alright, very final question. Is there a simple way I can run a quick check for homoscedasticity? My whole day of googling only brought me here:

http://web.uvic.ca/~dgiles/downloads...ice/index.html

But I can't tell if those program files are even for Stata or not.

Last edited by Jim Walker; 17 May 2016, 21:40.
Jim Walker

Join Date: May 2016

Posts: 8
#13

18 May 2016, 09:02

No need to respond to my last post, I found oglm and used that.
