logistic regressions - interpreting odds ratios and margins

Robbie Hopper

Join Date: Jun 2020

Posts: 18
#1

09 Jul 2020, 01:42

Dear Statalist

I have run a simple logistic regression using a child labour survey in an East African country.

I regressed a binary indicator for whether a household reported begging on an indicator for whether the household is child headed.

The logistic regression showed that child-headed households had an odds ratio of 2.51. The constant was 0.54. I interpret this as meaning that child-headed households are 2.5 times more likely to beg than non-child-headed households (although I am not sure if I need to subtract the constant?)

However, the margins command shows that the predicted probability of begging is 0.58 in a child-headed household and 0.35 in a non-child-headed household. As I understand them, the odds ratio and the margins do not agree. How can the odds of begging in a child-headed household be 2.5 times greater according to the odds ratio, while the probability of begging is only 1.6 times greater?

Apologies for the no doubt obvious question, but I am struggling to find any answers.

. logistic begging i.child_head

Logistic regression                             Number of obs     =      1,288
                                                LR chi2(1)        =       3.95
                                                Prob > chi2       =     0.0470
Log likelihood = -836.89941                     Pseudo R2         =     0.0024

------------------------------------------------------------------------------
     begging | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.child_head |    2.51981   1.180173     1.97   0.048     1.006238    6.310081
       _cons |    .545676    .032052   -10.31   0.000     .4863365    .6122557
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

.
end of do-file

. margin child_head

Adjusted predictions                            Number of obs     =      1,288
Model VCE    : OIM

Expression   : Pr(begging), predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  child_head |
          0  |   .3530339   .0134158    26.31   0.000     .3267393    .3793285
          1  |   .5789474    .113269     5.11   0.000     .3569443    .8009505
------------------------------------------------------------------------------

Ivan Privalko

Join Date: Aug 2015
Posts: 53

09 Jul 2020, 01:55

Hey Robbie,

This is a really interesting survey and a great question. The important thing to think about here is the difference between absolute and relative rates. This is a common issue for students of all types of models. Your first finding is that households headed by children are more likely to beg than households headed by adults. This is a relative rate: if we give adult-headed households a value of 1, then households headed by children are 2.5 times more likely to beg.

Code:


. logistic begging i.child_head

Logistic regression                             Number of obs     =      1,288
                                                LR chi2(1)        =       3.95
                                                Prob > chi2       =     0.0470
Log likelihood = -836.89941                     Pseudo R2         =     0.0024

------------------------------------------------------------------------------
     begging | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.child_head |    2.51981   1.180173     1.97   0.048     1.006238    6.310081
       _cons |    .545676    .032052   -10.31   0.000     .4863365    .6122557
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

.

The margins command gives you the absolute rate of begging, that is, the predicted probability of begging given the variables in your model.

This is slightly different. If you run margins, your output is saying that households headed by adults have a predicted probability of begging of 35%. Households headed by children have a predicted probability of 58%.

Code:

. margin child_head

Adjusted predictions                            Number of obs     =      1,288
Model VCE    : OIM

Expression   : Pr(begging), predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  child_head |
          0  |   .3530339   .0134158    26.31   0.000     .3267393    .3793285
          1  |   .5789474    .113269     5.11   0.000     .3569443    .8009505
------------------------------------------------------------------------------

You're looking at the same output from a different angle, I would say.
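To see how the two outputs line up, the odds in the logistic output can be converted into the probabilities that margins reports. The following is a minimal Python sketch rather than anything Stata produces; the function and variable names are illustrative, and the numbers are copied from the logistic output above.

```python
# Numbers copied from the logistic output above.
baseline_odds = 0.545676   # _cons: odds of begging when child_head == 0
odds_ratio = 2.51981       # 1.child_head

def odds_to_prob(odds):
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1 + odds)

# Odds for child-headed households = baseline odds times the odds ratio.
child_odds = baseline_odds * odds_ratio

p_adult = odds_to_prob(baseline_odds)
p_child = odds_to_prob(child_odds)

print(f"adult-headed: odds = {baseline_odds:.4f}, probability = {p_adult:.4f}")
print(f"child-headed: odds = {child_odds:.4f}, probability = {p_child:.4f}")
```

Up to rounding in the displayed coefficients, the two probabilities should match the margins of .3530339 and .5789474.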

Robbie Hopper

Join Date: Jun 2020

Posts: 18
#3

09 Jul 2020, 02:05

Thanks Ivan, I appreciate your quick response.

However, I am still slightly confused. If we say that child headed households are 2.5 times more likely to beg than adult headed households, then how can the probability of them begging be only 1.6 times greater than adult headed households (0.58 compared to 0.35)?

I believe my understanding of odds ratio and probabilities is obviously wrong, but I assume if the odds ratios are relative like you said, then comparing the absolute rates of begging for child headed and adult headed households using the margins command should yield the same results. The margin output for child headed households should be 2.5 times greater than the output for adults. However it is only 1.6 times greater.

Robbie

Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#4

09 Jul 2020, 02:34

Originally posted by Robbie Hopper

I believe my understanding of odds ratio and probabilities is obviously wrong

Yes.

Odds are p / (1 - p).

So, the ratio of two odds (an odds ratio) would be

. display in smcl as text 0.58 / (1 - 0.58) / (0.35 / (1 - 0.35))
2.5646259
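The 2.56 above differs slightly from the 2.52 in the logistic output only because 0.58 and 0.35 are rounded. As a quick check outside Stata, here is a minimal Python sketch (names are illustrative) that repeats the calculation with the unrounded margins from the first post.

```python
def prob_to_odds(p):
    """Convert a probability to odds: odds = p / (1 - p)."""
    return p / (1 - p)

# Rounded probabilities, as used in the display command above.
or_rounded = prob_to_odds(0.58) / prob_to_odds(0.35)

# Unrounded margins copied from the margins output in the first post.
p_adult = 0.3530339
p_child = 0.5789474
or_unrounded = prob_to_odds(p_child) / prob_to_odds(p_adult)

print(f"odds ratio from rounded margins:   {or_rounded:.5f}")
print(f"odds ratio from unrounded margins: {or_unrounded:.5f}")
```

With the unrounded margins the ratio should come out at about 2.5198, the odds ratio that logistic reported.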

Maarten Buis

Join Date: Mar 2014

Posts: 3458
#5

09 Jul 2020, 02:37

Odds are related to a probability, but they are not the same thing. They are related in the sense that both are ways of quantifying how likely an event (success) is. The odds are the average number of successes per failure. The probability is the average number of successes per trial. The ratio you computed from the output of margins is a risk ratio (a ratio of probabilities), while the odds ratio you get from the output of logistic is a ratio of odds.
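To make the distinction concrete, the following Python sketch computes both ratios from the two probabilities in the margins output of the first post, and then shows how the risk ratio implied by a fixed odds ratio changes with the baseline probability. The names are illustrative, and the conversion is plain algebra on odds = p / (1 - p), not something Stata prints.

```python
p_adult = 0.3530339   # Pr(begging), child_head == 0
p_child = 0.5789474   # Pr(begging), child_head == 1

# Risk ratio: a ratio of probabilities (successes per trial).
risk_ratio = p_child / p_adult

# Odds ratio: a ratio of odds (successes per failure).
odds_ratio = (p_child / (1 - p_child)) / (p_adult / (1 - p_adult))

print(f"risk ratio = {risk_ratio:.2f}, odds ratio = {odds_ratio:.2f}")

def implied_risk_ratio(odds_ratio, p_baseline):
    """Risk ratio implied by an odds ratio at a given baseline probability."""
    odds_exposed = odds_ratio * p_baseline / (1 - p_baseline)
    p_exposed = odds_exposed / (1 + odds_exposed)
    return p_exposed / p_baseline

# The same odds ratio corresponds to different risk ratios
# depending on how common the outcome is in the baseline group.
for p0 in (0.01, 0.10, 0.35, 0.60):
    rr = implied_risk_ratio(odds_ratio, p0)
    print(f"baseline probability {p0:.2f} -> risk ratio {rr:.2f}")
```

The two ratios are close only when the outcome is rare in the baseline group. With a baseline probability near 0.35, as here, the odds ratio is noticeably larger than the risk ratio.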

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------

Robbie Hopper

Join Date: Jun 2020

Posts: 18
#6

09 Jul 2020, 02:46

Thank you Maarten and Joseph.

Maarten (or Joseph or anyone else for that matter), as I am trying to compare the likelihood of child-headed households begging compared to adult-headed households, which is more intuitive for my interpretation, the odds ratio or the probability?

My work mainly looks at household characteristics and how they affect a range of adverse outcomes, such as begging, child labour, etc.

Do odds ratios or probabilities lend themselves better to this interpretation? Or how would you interpret the results above (from my original post) using odds ratios or probabilities?

Thanks

Robbie Hopper

Join Date: Jun 2020

Posts: 18
#7

09 Jul 2020, 03:37

To rephrase my questions in a better way...

I am looking at various negative outcomes for children (child labour, begging, etc.) and comparing them across a range of household characteristics (child-headed, large households, elderly-headed households, etc.).

The results show some interesting findings. For instance, regressing the binary dependent variable 'begging' on child headship gives an odds ratio of 2.5 for child-headed households. For the same regression, the margins command shows probabilities of 0.58 for child-headed households and 0.35 for adult-headed households.

I am at a loss as to the best way to interpret these findings in a manner that makes sense for the data...

Is it more intuitive to say that child-headed households are 2.5 times more likely to beg than adult-headed households (or is this not a correct interpretation?), or is the better interpretation that child-headed households are 1.6 times more likely to beg than adult-headed households (0.58 v 0.35)?

Can anyone assist?

Maarten Buis

Join Date: Mar 2014

Posts: 3458
#8

09 Jul 2020, 04:01

Especially with just a single categorical explanatory variable, there is no difference between these two; they are just different ways of saying the same thing, each with its own strengths and weaknesses. What is intuitive depends on the person. Many people think probabilities are more intuitive (but it is well known that people don't really understand probabilities), but if you go to a horse track, people will tell you that odds are obviously more intuitive. I would start by working to really understand what each of these is: go through the math, go through examples, and convince yourself that they are just different representations of the same thing. After that, you can look at your intended audience and decide which is easiest for them.

Ivan Privalko

Join Date: Aug 2015

Posts: 53
#9

09 Jul 2020, 04:18

Hi Robbie, sorry, I was wrong in my original comment. An odds ratio greater than one indicates a positive association: the odds of begging are higher for child-headed households. But this doesn't mean they are 2.5 times "more likely to beg". It means that their odds of begging, not their probability of begging, are about 2.5 times as high.

Robbie Hopper

Join Date: Jun 2020

Posts: 18
#10

10 Jul 2020, 04:03

Thanks all for help on this

My final question relates to the p-values shown for margins and logistic regressions. On occasion, the p-value in the logistic regression is far from significant, but in the margins output it becomes significant.

I assume it is the former that matters, but can you explain why this is the case, and clarify whether the p-values for the predicted probabilities are valid and should be used for interpretation?

Thanks

Ivan Privalko

Join Date: Aug 2015

Posts: 53
#11

10 Jul 2020, 04:25

Hi Robbie,
This kind of gets back to the absolute versus relative rates again.

In the margins output, the p-value is for a test of whether the predicted mean (here, the predicted probability) differs significantly from 0 (not from the value for the other households, just 0). I'm quoting that from Michael Mitchell's Interpreting and Visualizing Regression Models Using Stata. He has a section on the contrast command and on how margins can be used to compare predicted probabilities, which is, I think, what you're after. Check out "contrast".
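The z statistics in the margins table can be reproduced by dividing each margin by its standard error, which shows that each row is tested against 0 rather than against the other group. The Python sketch below does this with the numbers from the first post. Its final part, a test of the difference between the two margins, assumes the two predictions are uncorrelated. That should hold for this model because child_head is its only predictor, but it is an assumption of this sketch, not something shown in the output. Names are illustrative.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Margins and delta-method standard errors copied from the first post.
margin_adult, se_adult = 0.3530339, 0.0134158
margin_child, se_child = 0.5789474, 0.113269

# What each row of the margins table tests: H0: margin = 0.
z_adult = margin_adult / se_adult
z_child = margin_child / se_child
print(f"adult-headed: z = {z_adult:.2f}, p = {two_sided_p(z_adult):.3f}")
print(f"child-headed: z = {z_child:.2f}, p = {two_sided_p(z_child):.3f}")

# A comparison of the groups tests H0: difference = 0 instead.
# ASSUMPTION: the two margins are uncorrelated (see the note above).
diff = margin_child - margin_adult
se_diff = math.sqrt(se_adult**2 + se_child**2)
z_diff = diff / se_diff
print(f"difference:   z = {z_diff:.2f}, p = {two_sided_p(z_diff):.3f}")
```

The first two tests ask whether the probability of begging in each group is zero, which is rarely the question of interest. The last one is the kind of comparison that a contrast of the margins addresses.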

Last edited by Ivan Privalko; 10 Jul 2020, 04:27.