Interpreting Interaction Term effect in OLS regression

Greg Oliver

Join Date: Nov 2021

Posts: 2
#1

20 Nov 2021, 13:14

Hello everyone,
I am a beginner with Stata so please bear with me.

I'm having a hard time understanding the effects of my interaction term on my main explanatory variables. For context, one of my hypotheses assumes the relationship between social media use (IV) and an individual's inclination to participate in non-traditional political participation (DV created by indexing boycott, protest, and petition) will be stronger for citizens situated on the left of the political spectrum. To determine this, I created an interaction term combining Social Media and Ideology by running Social*Ideology.

1. When I ran regress with my interaction term, my interaction term becomes statistically significant, the original Social Media variable becomes statistically insignificant (and the coefficient becomes negative) and the original Ideology variable stays statistically significant. How can I interpret this? Am I even approaching this the right way?

2. When I add the interaction term, I now fail the Omitted Variable Test when running estat ovtest. Is there any way to remedy this, or is this something that's expected to happen with an interaction term?

The first model is without the interaction term, while the second model is with the interaction term.

Thank you so much in advance, any help is extremely appreciated!

Last edited by Greg Oliver; 20 Nov 2021, 13:22.
Clyde Schechter

Join Date: Apr 2014

Posts: 30059
#2

20 Nov 2021, 14:08

There is no problem with your analyses, assuming that you have correctly calculated the interaction variable as the product of SocialMedia and Ideology. You just need to understand what they mean and how they relate (or, more importantly, do not relate) to each other.

From your code I cannot discern whether the SocialMedia and Ideology variables are discrete or continuous. At the most abstract level, the same principles apply either way, but the concrete implications for interpretation are somewhat different.

The common abstract principle is this: when you have a model containing X, Y, and their interaction X#Y, the coefficient of X is no longer "the effect of X." It is the effect of X conditional on Y = 0. Similarly, the coefficient of Y is not "the effect of Y" in the interaction model: it is the effect of Y conditional on X = 0.
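As a sketch of why this holds, write the interaction model with generic coefficients (the outcome symbol Z and the betas are notation introduced here for illustration, not taken from your output):

```latex
\begin{aligned}
E[Z \mid X, Y] &= \beta_0 + \beta_1 X + \beta_2 Y + \beta_3 XY \\
\frac{\partial E[Z \mid X, Y]}{\partial X} &= \beta_1 + \beta_3 Y
  \quad\text{(reduces to } \beta_1 \text{ at } Y = 0\text{)} \\
\frac{\partial E[Z \mid X, Y]}{\partial Y} &= \beta_2 + \beta_3 X
  \quad\text{(reduces to } \beta_2 \text{ at } X = 0\text{)}
\end{aligned}
```

Read this way, a change in sign or significance of the SocialMedia coefficient after adding the interaction only describes the effect of SocialMedia at Ideology = 0.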

Now, depending on how the variables X and Y are coded, the conditions X = 0 and Y = 0 may not be instantiated in the data, and may not even be theoretically possible. In that case, these coefficients in their own right are entirely meaningless. If values of X or Y are sometimes zero, then they are not entirely meaningless, but unless a zero value of X or Y has some real importance in its own right, these coefficients represent at best a curious and obscure fact of no importance. The bottom line, then, is that in most circumstances, in an interaction model, the coefficients of X and Y themselves should be ignored. What you want to look at instead are things like the marginal effects of X and Y at important, meaningful values of the other, or perhaps averaged over all values of the other.

If X and Y are dichotomous variables, it's rather simple, because we have to consider only the marginal effect of X for two values of Y (one of which is often 0 and the other often is 1), and vice versa. These are easy to calculate after a regression using -lincom-. But there is an easier way, using the -margins- command, which will help you with polytomous and continuous cases as well. However, to use it, you have to redo your regressions, using factor variable notation (see -help fvvarlist- for details) rather than a homebrew interaction variable. So for discrete variables

Code:

reg Nontraditional i.SocialMedia##i.Ideology /*rest of your variables here--I'm not going to write them all out*/ , robust
margins SocialMedia, dydx(Ideology)
margins Ideology, dydx(SocialMedia)
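For the dichotomous case, here is a small runnable sketch on Stata's shipped auto dataset, which only stands in for your data. The 0/1 variable heavy and its cutoff of 3000 are made up for illustration. It shows -margins- and -lincom- producing the same marginal effect of one dichotomous variable at the nonzero level of the other:

```stata
* Illustration only: auto.dta is a stand-in, not the poster's data
sysuse auto, clear

* Hypothetical dichotomous variable (cutoff chosen arbitrarily)
generate byte heavy = weight > 3000

* Factor-variable notation creates the interaction term
regress price i.foreign##i.heavy, vce(robust)

* Marginal effect of foreign at heavy = 0 and at heavy = 1
margins heavy, dydx(foreign)

* The heavy = 1 effect by hand: main coefficient plus interaction coefficient
lincom 1.foreign + 1.foreign#1.heavy
```

The -lincom- estimate should match the heavy = 1 row of the -margins- output, while the heavy = 0 row is just the coefficient of 1.foreign.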

If, say, Social Media is discrete but Ideology is on a continuous scale, you need to first select the values of Ideology that interest you in terms of wanting to know the marginal effect of SocialMedia at those values of Ideology. Typically, such a list roughly spans the observed range of the data and includes a moderate number of points in between:

Code:

reg Nontraditional i.SocialMedia##c.Ideology
margins SocialMedia, dydx(Ideology)
margins, dydx(SocialMedia) at(Ideology = (list_of_interesting_values_of_Ideology))

If both are continuous, then you need to first identify a list of interesting values of both variables. The code is

Code:

reg Nontraditional c.SocialMedia##c.Ideology
margins, dydx(SocialMedia) at(Ideology = (list_of_interesting_values_of_Ideology))
margins, dydx(Ideology) at(SocialMedia = (list_of_interesting_values_of_SocialMedia))

By the way, if you just do something like

Code:

margins, dydx(Ideology)

you will get the average marginal effect of Ideology, averaged over the observed distribution of SocialMedia, a figure that may or may not be relevant to your research goals.
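To see these pieces working together, here is a runnable sketch of the continuous-by-continuous case, again on the shipped auto dataset as a stand-in. The variables price, mpg, and weight are not yours, and the at() values are arbitrary points inside the observed range of weight:

```stata
* Illustration only: auto.dta is a stand-in, not the poster's data
sysuse auto, clear

* Continuous-by-continuous interaction in factor-variable notation
regress price c.mpg##c.weight, vce(robust)

* Marginal effect of mpg at three arbitrary, in-range values of weight
margins, dydx(mpg) at(weight = (2000 3000 4000))

* The same quantity by hand at weight = 3000
lincom mpg + 3000*c.mpg#c.weight

* Average marginal effect of mpg over the observed distribution of weight
margins, dydx(mpg)
```

The coefficient of mpg in the regression table is the effect at weight = 0, a value no car in the data has, which is why the -margins- results are the ones worth reading.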

For an exceptionally lucid and thorough explanation of interaction models, I recommend Richard Williams' excellent write-up at https://www3.nd.edu/~rwilliam/stats2/l53.pdf.

For future reference, please read the Forum FAQ, with particular attention to #12, where the most effective ways of displaying example data, code, and Stata output are explained. In particular, screenshots are strongly deprecated. Among the reasons is that they are often unreadable on the reader's end: yours was just barely readable on my computer. Had it been a tiny bit smaller, it would have been entirely useless. Even as it is, had I been less willing to strain my eyes to read it, I would have skipped over this post.
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2400
#3

20 Nov 2021, 16:46

Please be aware that you have posted this same question to Reddit. Cross-posting here is allowed, but you are asked to notify us so that those who are willing to help do not duplicate their efforts. This is detailed in the FAQ, which you are asked to read when you sign up and post.
Greg Oliver

Join Date: Nov 2021

Posts: 2
#4

20 Nov 2021, 19:05

Clyde, thank you so much for the quick and informative reply, I am extremely appreciative.
This is my first foray into quantitative research methods, Stata and subsequently Statalist so I apologize for my rookie mistakes. I will be sure to be more cautious in the future when posting on the forum.