
  • Best model for binary dependent var and ordinal independent var?

    I have a dataset in which I can see whether individuals made a certain choice in a game. The variable "choice" is binary, coded 0/1 depending on whether they made that choice or not.

    I also have a set of variables from a survey that was administered at the end of the game. These variables include gender and age, as well as self-reported measures of risk, patience, and altruism. The latter are on a scale from 1 to 10 (example: "on a scale from 1 to 10, how willing are you to take risks?"). I would like to see whether these measures explain the individual's likelihood of making that "choice" in the game.

    In sum, I have a binary dependent variable that I'd like to regress on a set of individual characteristics (e.g. age) and ordinal independent variables (risk, patience, altruism).

    What would be the best econometric model to use? Is ologit recommended? (In this forum there seem to be mixed views on it :-) ).

    Thank you!


  • #2
    I don't know best, but maybe
    Code:
    logit choice i.sex c.(age risk patience altruism)
    If the number of observed values of any of the risk, patience, and altruism variables is restricted (only a handful of the 10 available values are observed in the data), then you could make that variable categorical, too, using factor-variable notation.
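
    As a rough sketch of that suggestion, assuming it is risk that takes only a few distinct values (which variable is affected is an assumption; the variable names are those from the question), you could first look at the observed values and then enter that variable with the i. prefix:
    Code:
    * check which of the 10 possible values actually occur in the data
    tabulate risk
    * risk entered as categorical, the other measures kept continuous
    logit choice i.sex i.risk c.(age patience altruism)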



    • #3
      There are no mixed views on that, since this is not a matter of opinion but something you can easily check yourself: there is no difference between ologit and logit when your dependent/left-hand-side/explained/y-variable is binary.

      Code:
      . sysuse auto
      (1978 Automobile Data)
      
      . ologit foreign rep78, nolog
      
      Ordered logistic regression                     Number of obs     =         69
                                                      LR chi2(1)        =      29.37
                                                      Prob > chi2       =     0.0000
      Log likelihood = -27.716037                     Pseudo R2         =     0.3463
      
      ------------------------------------------------------------------------------
           foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             rep78 |   1.969267   .4785224     4.12   0.000      1.03138    2.907154
      -------------+----------------------------------------------------------------
             /cut1 |   8.043597   1.848757                        4.4201    11.66709
      ------------------------------------------------------------------------------
      
      . logit foreign rep78, nolog
      
      Logistic regression                             Number of obs     =         69
                                                      LR chi2(1)        =      29.37
                                                      Prob > chi2       =     0.0000
      Log likelihood = -27.716037                     Pseudo R2         =     0.3463
      
      ------------------------------------------------------------------------------
           foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             rep78 |   1.969267   .4785224     4.12   0.000      1.03138    2.907154
             _cons |  -8.043597   1.848757    -4.35   0.000    -11.66709     -4.4201
      ------------------------------------------------------------------------------
      (The constant in the logit model is the negative of the cut1 parameter, but that is just an identification constraint; it does not change the model.)

      So the choice between ordered and binary models is solely determined by the type of dependent variable. The fact that one or more of your independent variables is ordinal is completely irrelevant for that choice. So the answer to your question is logit.

      This still leaves the underlying question of how to include your ordinal explanatory variable in your model. That is a complicated question, or the question is easy but the answer is complicated. The short and not very useful answer is: it depends.

      You could add the variable as a nominal variable. That way you ensure that you do not treat all the distances between adjacent categories as equal. The downside is that with 10 values you would add 9 indicator / dummy variables, and this is likely to cause trouble in a model for a binary dependent variable. If this does work, then the contrast command with the ar. prefix is probably useful as a way to display and interpret the results: it reorganizes the results such that the parameter of each indicator variable can be interpreted as a comparison with the previous category, which often makes more sense for an ordinal variable. This does not change the model, it only makes that model easier to interpret.

      Alternatively, you can add it as a continuous variable. That is more likely to work, but now you make the assumption that all the distances between adjacent categories are equal.

      You could also collapse categories until you get a reasonable estimate, but then you lose information.

      So there are many ways to do this, all with their own advantages and disadvantages, and it depends on your data and the exact goal of your study which of these is most appropriate.
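
      A minimal sketch of the nominal and collapsed options, using the variable names from the question. The new variable risk3 and the cut points in the recode are purely illustrative assumptions; how to collapse should depend on your data:
      Code:
      * risk as a nominal variable: 9 indicator variables, no equal-spacing assumption
      logit choice i.sex i.risk c.(age patience altruism)
      * same model, redisplayed: each level of risk compared with the previous level
      contrast ar.risk

      * collapsed version of risk (illustrative cut points, not a recommendation)
      recode risk (1/3 = 1) (4/7 = 2) (8/10 = 3), generate(risk3)
      logit choice i.sex i.risk3 c.(age patience altruism)
      The continuous option is the c.risk specification already shown in #2.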
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------

