Marginal effects and interactions

Claire McKenna

Join Date: Feb 2022

Posts: 83
#1


30 Apr 2023, 10:33

I'm running a logit regression with five independent variables, each an interaction between a time period of roughly 6 months (coded 1 for that period and 0 for all other year-months, which are covered by the other interactions) and status on a policy variable of interest (strict = 1, not strict = 0). Rather than use Stata's # operator for the interactions in the logit command, which ends up dropping variables due to collinearity, I've generated each interaction myself, separately (e.g., var1 = time1 * strict, etc.). Those interaction variables are:

Code:

generate var1 = time1 * strict
generate var2 = time1 * (1 - strict)
generate var3 = time2 * strict
generate var4 = time2 * (1 - strict)
generate var5 = time3 * strict
generate var6 = time3 * (1 - strict)

In my regression, I omit var1. So it’s

Code:

logit outcome var2 var3 var4 var5 var6 …

My question is about the margins output that follows--do I interpret the marginal effects results (below) relative to the omitted category, var1? If not, is there a way to have the margins commands compare to var1? Thanks so much.

Code:

margins, dydx(i.var2)
margins, dydx(i.var3)
margins, dydx(i.var4)
margins, dydx(i.var5)
margins, dydx(i.var6)

Last edited by Claire McKenna; 30 Apr 2023, 10:35.
Clyde Schechter

Join Date: Apr 2014

Posts: 30114
#2

30 Apr 2023, 11:54

Stata does not omit collinear variables out of some quirk. It does so because linear algebra requires it. You cannot violate the laws of linear algebra, and any attempt to disguise what you are doing and coax Stata to keep them will always fail. If something you are doing in Stata is producing a collinearity that you think shouldn't be there, then either there is a problem with your data that is causing some variables that shouldn't be collinear to be so, or there is a subtle relationship among your variables based on how they are defined that you are unaware of. In the first case you need to fix your data set, and in the second case you need to refine your understanding of your data. I have been using Stata since 1994 (version 4), and I have never known Stata to get this wrong.

Using the approach you describe, you will not be able to use -margins- at all, because in the logistic regression you are not using factor variable notation. So Stata will not know what you are talking about when it sees i.var2, etc. And if you correct that by adding the i. prefixes, then -margins- will run but give you incorrect results because it will not know about the relationships that exist among these variables.

You can get part of what you want with:

Code:

logistic outcome i.time#i.strict

where time is a variable coded 1, 2, or 3, and strict is coded 0 or 1. The logistic output will have 5 levels for the interaction: all the combinations except 1.time#0.strict. It will look like the output from your proposed equation, just labeled a little differently.
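
A minimal sketch of why hand-built cell indicators and the factor-variable interaction describe the same model. It assumes Stata's example dataset margex, with its binary outcome, three-level group (standing in for time), and binary treatment (standing in for strict); none of these names come from the original poster's data, and the g#t# indicator names are made up for illustration.

```stata
* Sketch only: margex stands in for the poster's data.
webuse margex, clear

* Hand-built indicators, one per group-by-treatment cell (hypothetical names)
generate byte g1t1 = (group == 1) & (treatment == 1)
generate byte g1t0 = (group == 1) & (treatment == 0)
generate byte g2t1 = (group == 2) & (treatment == 1)
generate byte g2t0 = (group == 2) & (treatment == 0)
generate byte g3t1 = (group == 3) & (treatment == 1)
generate byte g3t0 = (group == 3) & (treatment == 0)

* Omit one cell by hand (g1t1 here, mirroring the omitted var1 in post #1)
logit outcome g1t0 g2t1 g2t0 g3t1 g3t0
estimates store manual

* Let factor-variable notation choose the base cell instead
logit outcome i.group#i.treatment
estimates store factor

* Same model, different base cell: the log likelihoods should agree
estimates table manual factor, stats(N ll)
```

The six indicators sum to 1 in every observation, which is why one of them has to be left out whenever the model has a constant. Only the factor-variable version tells -margins- how the cells relate to group and treatment.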

After that, you can get the average marginal effects of time and of strict if those are interesting to you with

Code:

margins, dydx(time)
margins, dydx(strict)

You cannot get a marginal effect of an interaction term, because there is no such thing. Mathematically, it does not exist. You can "trick Stata" into giving you the marginal effects of var2, var3, etc. in your equation. But, as I have already pointed out, because -margins- does not understand the relationships among those variables, the results it produces are wrong and meaningless, they are an illusion.

Why do interactions not have marginal effects? The marginal effect of a discrete explanatory variable is defined as the expected outcome difference associated with a unit change in that explanatory variable. Now, what is a unit change in, say, 2.time#1.strict? It could be a change from time = 1 (or 3) and strict = 1 to time = 2 and strict = 1. Or it could be a change from time = 2 and strict = 0 to time = 2 and strict = 1. Any of those three possibilities produces a unit change of 2.time#1.strict from 0 to 1. But those three ways of changing 2.time#1.strict will, in general, have different associated changes in the outcome. In fact, the only situation in which those three things will not have different changes in the outcome is if the outcome does not depend on time or strict! So the "marginal effect" of 2.time#1.strict is ill-defined because it could be any of three different values. (With continuous variables the situation is infinitely worse.) The same reasoning just applied to 2.time#1.strict applies similarly to any of the interactions.
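
The argument above can be written out explicitly. As shorthand introduced here for illustration (not part of the original post), let $p_{ts} = \Pr(\text{outcome} = 1 \mid \text{time} = t,\ \text{strict} = s)$. The indicator 2.time#1.strict moves from 0 to 1 under each of the following changes:

```latex
\begin{aligned}
(t=1,\ s=1) \to (t=2,\ s=1):&\quad \Delta_a = p_{21} - p_{11} \\
(t=3,\ s=1) \to (t=2,\ s=1):&\quad \Delta_b = p_{21} - p_{31} \\
(t=2,\ s=0) \to (t=2,\ s=1):&\quad \Delta_c = p_{21} - p_{20}
\end{aligned}
```

The three differences coincide only when $p_{11} = p_{31} = p_{20}$, so in general no single number can be called the effect of a unit change in 2.time#1.strict.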

Last edited by Clyde Schechter; 30 Apr 2023, 11:57.
Claire McKenna

Join Date: Feb 2022

Posts: 83
#3

30 Apr 2023, 12:19

Thank you very much, Clyde. I'm not sure why I didn't think to generate a multi-category time variable to begin with.

What you suggest works. So if I follow "logit outcome i.time#i.strict" with margins (time), dydx(i.strict), it should give me the average percentage point change associated with a change from less strict (0) to strict (1), in each of the three time periods of interest. Is this correct?
Clyde Schechter

Join Date: Apr 2014

Posts: 30114
#4

30 Apr 2023, 13:17

Yes, that's absolutely correct.
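
A hedged sketch of the command sequence confirmed here, again assuming Stata's margex example data, with group standing in for time and treatment for strict (these variable names are not from the original analysis):

```stata
* Sketch only: margex stands in for the poster's data.
webuse margex, clear

* Interaction of a three-level and a two-level factor variable
logit outcome i.group#i.treatment

* Discrete change in Pr(outcome) for treatment 0 -> 1,
* reported separately at each level of group
margins group, dydx(treatment)
```

The dy/dx column is on the probability scale, so multiplying by 100 gives percentage points.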
Claire McKenna

Join Date: Feb 2022

Posts: 83
#5

30 Apr 2023, 13:18

Many thanks.
Rosa Blau

Join Date: May 2017

Posts: 85
#6

18 Jun 2023, 03:13

Originally posted by Clyde Schechter

You cannot get a marginal effect of an interaction term, because there is no such thing. Mathematically, it does not exist. You can "trick Stata" into giving you the marginal effects of var2, var3, etc. in your equation. But, as I have already pointed out, because -margins- does not understand the relationships among those variables, the results it produces are wrong and meaningless, they are an illusion.

Why do interactions not have marginal effects? The marginal effect of a discrete explanatory variable is defined as the expected outcome difference associated with a unit change in that explanatory variable. Now, what is a unit change in, say, 2.time#1.strict? It could be a change from time = 1 (or 3) and strict = 1 to time = 2 and strict = 1. Or it could be a change from time = 2 and strict = 0 to time = 2 and strict = 1. Any of those three possibilities produces a unit change of 2.time#1.strict from 0 to 1. But those three ways of changing 2.time#1.strict will, in general, have different associated changes in the outcome. In fact, the only situation in which those three things will not have different changes in the outcome is if the outcome does not depend on time or strict! So the "marginal effect" of 2.time#1.strict is ill-defined because it could be any of three different values. (With continuous variables the situation is infinitely worse.) The same reasoning just applied to 2.time#1.strict applies similarly to any of the interactions.

Dear Dr. Schechter, this is extremely helpful, thank you. Do I understand correctly that a key problem in calculating / interpreting marginal effects for interactions is that the unit / margin change is not clear? If so, does the second difference test, and displaying the specific contrasts as proposed by Mize (2019) get around this problem effectively?

I ask because, no surprise, my theoretical premise requires an interaction effect but my variables are all categorical. I was about to calculate and interpret adjusted probabilities as suggested by Williams (2012), but then Prof. Williams himself in a different Statalist post pointed me towards Mize's (2019) work. I read Mize's (2019) paper after reading your post above, and while I found it persuasive, your words "you cannot get a marginal effect of an interaction term" still ring in my ear.
Clyde Schechter

Join Date: Apr 2014

Posts: 30114
#7

18 Jun 2023, 09:56

I think the key problem in calculating marginal effects in interaction models is that the subject of interaction models is, in general, poorly taught. It is very common for people to completely misunderstand what the various terms in an interaction model actually represent. And it is also very common for people to extrapolate from the existence of marginal effects for uninteracted variables to thinking that marginal effects should also exist for interaction terms, when they do not.

The Mize paper is a nice one, and you won't go wrong if you follow the advice therein. But it has nothing to do with the non-existence of marginal effects for interaction terms. Nothing in that paper contradicts my warning. Mize is dealing with the fact that in non-linear models that contain interactions, the coefficient of the interaction term is not the way to describe an interaction effect (n.b., interaction effect, not marginal effect of the interaction term.) Interaction effects exist. The interaction effect is the difference in the marginal effect of one variable associated with different values of the other variable(s) with which it is interacted. In linear models interaction effects are properly estimated by the coefficient of the interaction term. In non-linear models the coefficient of the interaction term is not an appropriate estimator for the interaction effect.

By contrast, marginal effects of interaction terms do not exist in any model. The reason is straightforward: a marginal effect of a predictor is the difference in outcome associated with a change in that predictor when everything else is held constant. But by definition, an interaction term can never change at all if everything else is held constant: an interaction term is a composite of other terms, and changes in the interaction term can only arise from changes in the terms of which it is a composite. This is a shorter way of saying what I said in the material you quoted in your post.
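
The distinction can be stated in symbols. Writing $p_{ts}$ for the predicted probability at time $t$ and strict $s$ (notation assumed here for illustration, not taken from the thread or from Mize's paper), the marginal effect of strict at a given time is a first difference, and the interaction effect is a difference between two such marginal effects:

```latex
\begin{aligned}
\mathrm{ME}_{\text{strict}}(t) &= p_{t1} - p_{t0} \\
\mathrm{IE}(t, t') &= \mathrm{ME}_{\text{strict}}(t) - \mathrm{ME}_{\text{strict}}(t')
  = (p_{t1} - p_{t0}) - (p_{t'1} - p_{t'0})
\end{aligned}
```

Both quantities are defined in terms of changes in time and strict themselves. Neither requires the interaction term to change while everything else is held constant, which is why interaction effects exist even though marginal effects of interaction terms do not.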
Rosa Blau

Join Date: May 2017

Posts: 85
#8

19 Jun 2023, 02:41

Again thank you very much for taking the time to write this out. I now need to let it sink in and re-read the Mize paper, and make sure I've understood / made the distinctions you articulated between the interaction effect versus (non-existent) marginal effect of interactions.

Claire McKenna

Join Date: Feb 2022

Posts: 83
#9

13 Aug 2023, 14:01

Hi. I wanted to check that I’m interpreting my updated analysis correctly. I’m running a logit regression where the independent variable is an interaction between a policy index and a time-period variable. The index ranges from 0 to 3, with 0 indicating that a state has 0 strict policies in place, 1 indicating that a state has 1 strict policy in place, and so on and so forth. The time period variable ranges from 1 to 4, with each category or step representing roughly 6 months. Sample code and output are below.

Clyde Schechter, am I interpreting the output correctly? For example…

The average percentage point change in my outcome associated with a change from 0 strict policies to 1 strict policy in time period 1 is 0.836.
The average percentage point change associated with a change from 0 strict policies to 2 strict policies, in time period 1 is 1.43.
The average percentage point change associated with a change from 0 strict policies to 3 strict policies, in time period 1 is 3.02.
And so on and so forth. The base category is a state with 0 (no) strict policies. As always, thank you for your help.

Code:

logit outcome i.time1#i.index controls…

margins (time1), dydx(i.index) 

Average marginal effects                                Number of obs = 56,762
Model VCE: Robust

Expression: Pr(reemp), predict()
dy/dx wrt:  1.index 2.index 3.index

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
0.index      |  (base outcome)
-------------+----------------------------------------------------------------
1.index      |
       time1 |
          1  |   .0083612    .009891     0.85   0.398    -.0110248    .0277473
          2  |   .0030888   .0132426     0.23   0.816    -.0228663    .0290439
          3  |   .0085533   .0089151     0.96   0.337    -.0089199    .0260266
          4  |   .0008308   .0223726     0.04   0.970    -.0430188    .0446803
-------------+----------------------------------------------------------------
2.index      |
       time1 |
          1  |   .0143268   .0124855     1.15   0.251    -.0101443     .038798
          2  |   .0195058   .0124638     1.56   0.118    -.0049229    .0439344
          3  |    .039888   .0130511     3.06   0.002     .0143083    .0654678
          4  |    .014882   .0191958     0.78   0.438    -.0227411    .0525051
-------------+----------------------------------------------------------------
3.index      |
       time1 |
          1  |    .030202   .0117005     2.58   0.010     .0072694    .0531345
          2  |   .0087232   .0163303     0.53   0.593    -.0232835    .0407299
          3  |   .0373631    .010378     3.60   0.000     .0170227    .0577035
          4  |   .0129602   .0225287     0.58   0.565    -.0311952    .0571157
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.


Clyde Schechter

Join Date: Apr 2014

Posts: 30114
#10

13 Aug 2023, 14:24

Yes, I agree with your interpretation.
Claire McKenna

Join Date: Feb 2022

Posts: 83
#11

13 Aug 2023, 15:07

Thanks, Clyde