  • Marginal effects vs separate regression by group levels

    I am running a -melogit- regression model:
    Code:
    melogit dv c.x1 i.x2 i.x3 i.x4 || x5: || x6:
    What is the difference between the following two approaches? Assume that x2 has two levels, 1 and 2.

    Marginal effects:
    Code:
    margins x2, dydx(x1)
    Separate regressions for each subgroup of x2:
    Code:
    melogit dv c.x1 i.x3 i.x4 if x2==1 || x5: || x6:
    melogit dv c.x1 i.x3 i.x4 if x2==2 || x5: || x6:
    Conceptually, what is the difference between the marginal effect of x1 at each level of x2 and the coefficient I obtain for x1 by running the regression separately for each level of x2? I am getting very different results from these two approaches with my actual data.

  • #2
    The output of -margins- after -melogit- is on the probability scale: with -dydx()- it reports changes in the probability of the outcome, which ranges from 0 to 1. The coefficients you get from the separate -melogit- regressions are not probabilities; they are log odds ratios, which potentially range from negative to positive infinity. So they are entirely different metrics.
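
    To see the two metrics side by side, here is a minimal sketch on simulated data (the variable names follow the thread's example; the data-generating values and the single random-effects level are arbitrary assumptions for illustration):
    Code:
    * simulate a two-level data set (assumed effect sizes, illustration only)
    clear
    set seed 12345
    set obs 2000
    generate x1 = rnormal()
    generate x2 = 1 + (runiform() > 0.5)       // two levels, 1 and 2
    generate x5 = ceil(20*runiform())          // 20 higher-level groups
    bysort x5: generate u5 = rnormal(0, 0.5)   // group-level random effect
    by x5: replace u5 = u5[1]
    generate dv = rbinomial(1, invlogit(-0.5 + 0.7*x1 + 0.4*(x2==2) + u5))

    melogit dv c.x1 i.x2 || x5:    // coefficients: log odds ratios
    margins x2, dydx(x1)           // marginal effects: changes in probability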

    • #3
      Thanks for clarifying, Clyde.

      This is my current understanding: if the -margins- command returns a value of y when x2 = 1, it would mean that the outcome is y% more likely for a unit change in x1, when x2 takes the value 1.

      Let's say the -melogit- command for x2==1 returns a coefficient z for the variable x1. So the odds ratio for x1 is exp(z). This would mean that the output gets multiplied by the odds ratio for a unit change in x1.

      Then would exp(z) = 1 + y? Basically, should the odds ratio be equal to the marginal effect + 1? Or would this be an erroneous understanding?

      • #4
        "This would mean that the output gets multiplied by the odds ratio for a unit change in x1."
        I'm not sure what you mean by the "output" here. But if you mean y, then, no, this is false.

        What it means is that the odds (not the probability) of y gets multiplied by exp(z) for each unit increase in x1. What this means in terms of the probability of y depends on the probability of y at the starting value of x1. Let's consider two examples. In both examples we'll set z = 0.693 (chosen because it is, to three decimal places, ln 2), so exp(z) = 2. Now suppose the probability of y at the starting value of x1 is 0.25. Then the odds of y = 0.25/(1-0.25) = 0.333... Since exp(z) = 2, after a one-unit increase in x1 the odds of y will be 2*0.333... = 0.666... With odds of 0.666..., the probability is 0.666.../(1+0.666...) = 0.4. So the difference in probability will be 0.4 - 0.25 = 0.15.

        Now, with the same z, suppose the probability of y is 0.95 at the starting value of x1. Then the odds of y is 0.95/(1-0.95) = 19. Accordingly, after a one-unit increase in x1, the odds of y will be 2*19 = 38. With odds = 38, the probability of y will be 38/(1+38) = 0.97. So in this case the difference in probability is 0.97 - 0.95 = 0.02.
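
        You can check this arithmetic directly in Stata, using -display- as a calculator (the numbers are just the ones from the two examples above):
        Code:
        display exp(0.693)               // ~2, the odds ratio
        display .25/(1-.25)              // baseline odds = .333...
        display (2*(1/3))/(1 + 2*(1/3))  // new probability = .4
        display .95/(1-.95)              // baseline odds = 19
        display (2*19)/(1 + 2*19)        // new probability = .974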

        "Then would exp(z) = 1 + y? Basically, should the odds ratio be equal to the marginal effect + 1? Or would this be an erroneous understanding?"
        That would be erroneous. What is true is that if z is close to zero (certainly no farther from 0 than 0.1, and preferably even closer), exp(z) is approximately equal to 1 + z. But it is only a good approximation for z very close to 0. In the literature you will see this approximation used indiscriminately: people see a regression coefficient of, say, 0.4 and say that it corresponds to a multiplicative effect of 1.4, or they call it a 40% increase. But that's way off: exp(0.4) = 1.49, which is a 49% increase. The approximation is only good when the absolute value of z is less than 0.1.
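
        A quick sketch of how fast the exp(z) ≈ 1 + z approximation degrades, again using -display- as a calculator:
        Code:
        display exp(0.05)   // 1.0513 -- close to 1 + 0.05
        display exp(0.10)   // 1.1052 -- still reasonably close to 1 + 0.10
        display exp(0.40)   // 1.4918 -- a 49% increase, not 40%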

        • #5
          This is super useful (and informative). Thanks a lot, Clyde.
