  • logit model with very large odds ratio

    Hello!

    I'm running a logit regression. The DV, action, is a dummy variable equal to 1 if a firm conducts a certain action and 0 otherwise. The IV, L.return, is a continuous variable for a firm's stock return, lagged to year t-1. I also have some control variables, some continuous and some dummies, all lagged to year t-1. I get a very large coefficient and odds ratio for L.return. I think this is probably because only 6.8% of observations have action = 1, so the data are extremely unbalanced (?). How can I work around this problem? Thanks a lot for any help!

    Code:
    ------------------------------------------------------------------------------------
                       |               Robust
                 action|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------------+----------------------------------------------------------------
                 return|
                   L1. |   3.348689   1.099598     3.05   0.002     1.193517    5.503862
                       |
    Code:
    ------------------------------------------------------------------------------------
                       |               Robust
                 action| Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------------+----------------------------------------------------------------
                 return|
                   L1. |    28.4654    31.3005     3.05   0.002     3.298662    245.6387

  • #2
    Recall how the odds ratio is interpreted: each one-unit increase in an independent variable multiplies the odds of the outcome by the odds ratio, here about 28.47 times higher odds of the firm doing whatever you modeled. Now, think about how your explanatory variable is scaled. I understand that stock returns are typically quoted in percent. Is your return variable coded in percentage points, or is it coded as a fraction, e.g. a return of 100 percent is coded as 1?

    I think you can see where I’m going with this. If you coded it as 1 = a 100 percentage point return, then a one-unit change in the explanatory variable is a very large change, which is pretty rare. That’s one possible explanation.
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.


    • #3
      Tell us something about the values that return takes. Run
      Code:
      summarize return, detail
      and copy the output into a new reply on this topic.


      • #4
        Hi Weiwen and William,

        Thanks a lot for your quick reply. Here are the details for the return variable. If a firm's stock return is 1.2% then in my sample it is expressed as 0.012.

        Code:
        -------------------------------------------------------------
              Percentiles      Smallest
         1%      .025641              0
         5%     .0487013              0
        10%     .0589971              0       Obs              27,900
        25%     .0818505              0       Sum of Wgt.      27,900
        
        50%     .1084337                      Mean           .1325732
                                Largest       Std. Dev.      .0777148
        75%     .1666667       .4266667
        90%     .2477876       .4266667       Variance       .0060396
        95%     .2916667       .4266667       Skewness        1.32929
        99%     .3870968       .4266667       Kurtosis       4.612329


        • #5
          My post crossed with post #2 by Weiwen Ng who correctly anticipates the results you showed us.

          The estimated odds ratio tells you that, holding the other variables fixed, the odds of the action being taken are about 28 times larger when the return is 1 than when the return is 0.

          The problem is that the values of your return are small relative to 1 (the largest is about 0.43), so returns are never going to differ by 1 and the effect inferred from the coefficient looks unrealistically large.

          If you were to report return in percentage points - so .012 becomes 1.2 - then your coefficient estimate will be reduced to 0.03348689 (and the standard error and confidence interval similarly) and your odds ratio will change from exp(3.348689) = 28.47 to exp(0.03348689) = 1.03.
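
          To spell out the arithmetic behind that rescaling, here is a short sketch. It assumes the rest of the model is unchanged. The symbols are only notation for illustration, not Stata output: x is the return as a fraction, x* = 100x is the return in percentage points, and β and SE are the coefficient and standard error reported for L.return.

          ```latex
          \begin{align*}
          \beta x &= \frac{\beta}{100}\,(100x) = \beta^{*} x^{*}, \qquad \beta^{*} = \frac{\beta}{100} = \frac{3.348689}{100} = 0.03348689,\\
          \mathrm{SE}^{*} &= \frac{1.099598}{100} = 0.01099598, \qquad z = \frac{\beta^{*}}{\mathrm{SE}^{*}} = \frac{\beta}{\mathrm{SE}} \approx 3.05,\\
          \mathrm{OR}^{*} &= e^{\beta^{*}} = e^{0.03348689} \approx 1.034, \qquad \left(\mathrm{OR}^{*}\right)^{100} = e^{\beta} \approx 28.47.
          \end{align*}
          ```

          So the z statistic and p-value do not change. Only the unit in which the effect is stated changes: each additional percentage point of lagged return multiplies the odds by about 1.034.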

          Alternatively, you can learn about the margins command and use it to present a more meaningful interpretation of your results.


          • #6
            I don't know how margins works with lagged variables. But anyway, William's point about using margins is a good one. If you consider margins, you want to remember that margins reports on the probability scale. Epidemiologists call this a risk difference. It is literally reporting a difference in probabilities, with probability scaled 0 to 1.

            The Stata forum's Richard Williams has a nice explainer here. The manual for margins is also good. Specific to your command (remember I don't know if the syntax works exactly with lagged variables), you might type something like:

            Code:
            margins, at(return = (0 (0.01) 0.4))
            marginsplot
            This means give me the probability of the event, setting the past-year returns at 0%, 1%, 2% ... 40%. And then plot them.
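
            As a rough sketch of what those numbers are, assume a logit whose linear predictor is an intercept, β times the lagged return, and γ'z for the controls. This notation is for illustration only and does not come from the output above. With at() and no other options, margins fixes return at each requested value r, computes each observation's predicted probability, and averages over the observed values of the other covariates:

            ```latex
            \begin{align*}
            p_i(r) &= \Pr(\text{action}_{it} = 1 \mid \text{return}_{i,t-1} = r,\ \mathbf{z}_{i}) = \frac{1}{1 + \exp\{-(\beta_0 + \beta r + \boldsymbol{\gamma}'\mathbf{z}_{i})\}},\\
            \bar{p}(r) &= \frac{1}{N}\sum_{i=1}^{N} p_i(r), \qquad \text{risk difference between } r_1 \text{ and } r_0:\ \bar{p}(r_1) - \bar{p}(r_0).
            \end{align*}
            ```

            Unlike the odds ratio, these probabilities do not depend on whether return is stored as a fraction or in percentage points, provided the at() values are rescaled to match.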


            • #7
              Hi William and Weiwen,

              Thanks a lot for your suggestions! After changing return to percentage points the results look more reasonable. I will try margins as well.
