Best model for binary dependent var and ordinal independent var?

Will Bryant

Join Date: Oct 2019

Posts: 4
#1


18 Oct 2019, 01:43

I have a dataset in which I can see whether individuals made a certain choice in a game. The variable "choice" is binary, coded 0/1 depending on whether they made that choice or not.

I also have a set of variables from a survey that was administered at the end of the game. These variables include gender and age, and also self-reported measures of risk, patience, and altruism. These are on a scale 1 to 10 (example: "on a scale 1 to 10, how willing are you to take risks?"). I would like to see if these measures explain the individual's likelihood of making that "choice" in the game.

In sum, I have a binary dependent variable that I'd like to regress on a set of individual characteristics (e.g. age) and ordinal independent variables (risk, patience, altruism).

What would be the best econometric model to use? Is ologit recommended? (In this forum there seem to be mixed views on it :-) )

Thank you!
Joseph Coveney

Join Date: Apr 2014

Posts: 4433
#2

18 Oct 2019, 02:10

I don't know about best, but maybe

Code:

logit choice i.sex c.(age risk patience altruism)

If the number of distinct values actually observed for any of the risk, patience and altruism variables is restricted (only a handful of the 10 available values appear in the data), then you could make that variable categorical, too, using the factor-variable notation.
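As a minimal sketch of that suggestion, assuming the variable names used above (choice, sex, age, risk, patience, altruism) and assuming, purely for illustration, that risk is the scale with only a few observed values:

```stata
* Check how many of the 10 possible values are actually observed
tabulate risk

* If only a handful of levels occur, enter risk as a factor variable (i.)
* and keep the remaining scales continuous (c.)
logit choice i.sex c.age i.risk c.(patience altruism)
```

Which scale, if any, to treat this way depends on what tabulate shows in your own data.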
Maarten Buis

Join Date: Mar 2014

Posts: 3464
#3

18 Oct 2019, 02:27

There are no mixed views on that, as this is not a matter of opinion; it is something you can easily check yourself: there is no difference between ologit and logit when your dependent/left-hand-side/explained/y-variable is binary.

Code:

. sysuse auto
(1978 Automobile Data)

. ologit foreign rep78, nolog

Ordered logistic regression                     Number of obs     =         69
                                                LR chi2(1)        =      29.37
                                                Prob > chi2       =     0.0000
Log likelihood = -27.716037                     Pseudo R2         =     0.3463

------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       rep78 |   1.969267   .4785224     4.12   0.000      1.03138    2.907154
-------------+----------------------------------------------------------------
       /cut1 |   8.043597   1.848757                        4.4201    11.66709
------------------------------------------------------------------------------

. logit foreign rep78, nolog

Logistic regression                             Number of obs     =         69
                                                LR chi2(1)        =      29.37
                                                Prob > chi2       =     0.0000
Log likelihood = -27.716037                     Pseudo R2         =     0.3463

------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       rep78 |   1.969267   .4785224     4.12   0.000      1.03138    2.907154
       _cons |  -8.043597   1.848757    -4.35   0.000    -11.66709     -4.4201
------------------------------------------------------------------------------

(The constant in the logit model is the negative of the cut1 parameter in the ologit model, but that is just a different identification constraint; it does not change the model.)
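If you want to verify that relationship numerically rather than by eye, something along these lines should do. It is only a sketch: the scalar name cut1 is arbitrary, and the _b[/cut1] syntax assumes a reasonably recent Stata (older versions refer to the same parameter as _b[cut1:_cons]).

```stata
sysuse auto, clear

* Fit the ordered model and store its single cutpoint
quietly ologit foreign rep78
scalar cut1 = _b[/cut1]

* Fit the binary model; its constant should be the negative of the cutpoint
quietly logit foreign rep78
display "cut1 + _cons = " cut1 + _b[_cons]
```

The displayed sum should be zero up to numerical precision.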

So the choice between ordered and binary models is solely determined by the type of dependent variable. The fact that one or more of your independent variables is ordinal is completely irrelevant for that choice. So the answer to your question is logit.

This still leaves the underlying question of how to include your ordinal explanatory variable in your model. That is a complicated question, or the question is easy but the answer is complicated. The short and not very useful answer is: it depends.

You could add the variable as a nominal variable. That way you ensure that you do not treat all the distances between adjacent categories as equal. The downside is that with 10 values you would add 9 indicator / dummy variables, and this is likely to cause trouble in a model for a binary dependent variable. If this does work, then the contrast command with the ar. operator is probably a useful way to display and interpret the results: it reorganizes the results such that the parameter of each indicator variable can be interpreted as a comparison with the previous category, which often makes more sense for an ordinal variable. This does not change the model; it only makes that model easier to interpret.

Alternatively, you can add it as a continuous variable. That is more likely to work, but now you make the assumption that all the distances between adjacent categories are equal.

You could also collapse categories until you get a reasonable estimate, but now you lose information.

So there are many ways to do this, all with their own advantages and disadvantages, and it depends on your data and the exact goal of your study which of these is most appropriate.
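The three options can be sketched as follows, using risk as the example. The variable names choice, sex, age, risk, patience and altruism are taken from earlier in this thread; the collapsed variable risk3 and its cut points (1-3, 4-7, 8-10) are hypothetical and chosen only for illustration.

```stata
* (1) Nominal: one indicator per observed level of risk
logit choice i.sex c.age i.risk c.patience c.altruism
* Re-express the risk effects as each level versus the previous level
contrast ar.risk

* (2) Continuous: a single slope, assumes equal distances between levels
logit choice i.sex c.age c.risk c.patience c.altruism

* (3) Collapsed: fewer categories, at the cost of some information
*     (the cut points below are arbitrary illustrations)
recode risk (1/3 = 1 "low") (4/7 = 2 "medium") (8/10 = 3 "high"), generate(risk3)
logit choice i.sex c.age i.risk3 c.patience c.altruism
```

The same pattern applies to patience and altruism; which version is appropriate depends on the data and the goal of the study, as discussed above.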

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
