
  • Problem with coeff interpretation of a regression with interaction

    Hi, for the sake of my question I will use the following variables: dependent = lnwage, control1 = gender, control2 = education level

    I want to do a regression on log wages and the interaction between gender and edu level.

    Currently to run the regression I am using the following input

    reg lnwage i.gender##edu

    The base levels are gender: male and edu: level 1

    From my understanding, the interpretation of the coefficient on female#level 2 = -0.2

    would be that a female with a level 2 education receives 20% less than a male with a level 1 education, and a female with a level 3 education receives 15% less than a male with a level 1 education, etc., since male, level 1 is the base.

    However, I want to be able to get output that allows me to interpret the coeff as a female with a level 2 edu receives 20% less than a male with a level 2 edu and a female with a level 3 edu receives 15% less than a male with a level 3 edu and so on.

    Basically I'd like a regression in which a female's edu level is compared with a male of the same edu level, rather than with a male at the base edu level, for every edu level, including the base level 1.

    I hope this makes sense. I am new to stata and haven't interpreted statistical values in years.

  • #2
    Please let me know if the question doesn't make sense or the desired output isn't possible. But I'm sure there must be a way to do this. Thanks in advance.

    • #3
      First, your interpretation of the interaction coefficient is wrong. Second, you cannot get the kind of results you want from a regression, but you can get what you are looking for from the -margins- command following a regression.

      Going with your variables lnwage, gender, and edu, with the base levels of gender and edu both being 1, the coefficient of 2.gender#2.edu (which is the only interaction coefficient you will see in the regression output) is not the difference in lnwage between any of the combinations of gender and edu. Let's write your model as an equation; the algebra is a little easier if we code gender and edu as 0 and 1 rather than 1 and 2:

      lnwage = b0 + b1*gender + b2*edu + b3*gender*edu + error term

      Then you can make the following table:

      gender      edu          E(lnwage)
      0 (male)    0 (level 1)  b0
      0 (male)    1 (level 2)  b0 + b2
      1 (female)  0 (level 1)  b0 + b1
      1 (female)  1 (level 2)  b0 + b1 + b2 + b3

      It is then apparent that b3, the interaction coefficient, is not the difference between any two of these groups.
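      What b3 does measure is the difference-in-differences: how much the female-male gap changes between the two education levels. A small numeric sketch (the coefficient values below are made up purely for illustration) makes this concrete:

```python
# Cell means built from the table above, with made-up values
# for b0..b3 (hypothetical numbers, not from any real data).
b0, b1, b2, b3 = 2.5, -0.1, 0.3, -0.2

m_lvl1 = b0                    # male, edu level 1
m_lvl2 = b0 + b2               # male, edu level 2
f_lvl1 = b0 + b1               # female, edu level 1
f_lvl2 = b0 + b1 + b2 + b3     # female, edu level 2

# The female-male gap at each education level:
gap_lvl1 = f_lvl1 - m_lvl1     # = b1
gap_lvl2 = f_lvl2 - m_lvl2     # = b1 + b3

# b3 is the CHANGE in that gap across education levels,
# not the gap itself:
print(gap_lvl2 - gap_lvl1)     # equals b3, up to floating-point rounding
```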

      Next, you have also used the heuristic that a 0.2 change in lnwage corresponds to a 20% change in wage. That heuristic is based on an approximate formula that is only a good approximation when the change in lnwage is small, say < 0.1. If you were to use the exact formula, a coefficient of -.2 corresponds to about an 18% decrement in wage.
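      The exact proportional change implied by a log-point coefficient b is exp(b) - 1, and a quick computation shows how the heuristic drifts as the coefficient grows:

```python
import math

coef = -0.2  # coefficient on the log-wage scale

approx = coef                  # the "0.2 => 20%" heuristic
exact = math.exp(coef) - 1     # exact proportional change in wage

print(f"approximate: {approx:.1%}")  # -20.0%
print(f"exact:       {exact:.1%}")   # -18.1%

# The two agree well only for small coefficients:
small = -0.05
print(math.exp(small) - 1)           # about -0.0488, close to -0.05
```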

      As you can see, it is a bit complicated to see what is going on when reading the coefficients of an interaction regression. The -margins- command makes things much simpler. So if you run
      Code:
      regress lnwage i.gender##i.edu
      margins gender#edu
      you will see the expected values of lnwage in each combination of gender and edu. And you can also get the marginal effect of gender at each given edu level by running:
      Code:
      margins edu, dydx(gender)
      That is, you will see the difference between males at edu 1 and females at edu 1, and the difference between males at edu 2 and females at edu 2.

      I recommend you read https://www3.nd.edu/~rwilliam/stats2/l53.pdf. It's a very clear and thorough explanation of interaction models, from the excellent Richard Williams.



      • #4
        Hi Clyde,

        I am still confused about marginal effects and need your help, please.

        I am working on a Panel data model. I want to measure the marginal effect of the interaction terms FDX * INF, FDX * INFVOL , FDX2 * INF and FDX2 * INFVOL on GDP

        My model is a multiplicative interaction model:
        GDP = β1 FDX + β2 FDX2+ β3 INF + β4 INFVOL+ β5 FDX * INF + β6 FDX * INFVOL + β7 FDX2 * INF + β8 FDX2 * INFVOL + β9 INIGDPPC + β10 GOV + β11 GFCF + β12 TRD + β13 LBOR

        by examining the partial derivative of GDP, as follows:

        ∂GDP/∂FDX = β1 + 2 β2 FDX + β5 INF + β6 INFVOL + 2 β7 FDX * INF + 2 β8 FDX * INFVOL
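        This derivative can be sanity-checked numerically with a central finite difference; here is a small sketch with arbitrary placeholder coefficient values (not estimates from any model):

```python
# Finite-difference check of dGDP/dFDX for the FDX-dependent part of the
# model above. Coefficient values are arbitrary placeholders.
b = {1: 0.5, 2: -0.03, 5: -0.4, 6: 0.25, 7: 0.02, 8: -0.01}

def gdp_fdx_terms(fdx, inf, infvol):
    """Only the terms of the model that involve FDX."""
    return (b[1]*fdx + b[2]*fdx**2 + b[5]*fdx*inf
            + b[6]*fdx*infvol + b[7]*fdx**2*inf + b[8]*fdx**2*infvol)

def analytic(fdx, inf, infvol):
    """b1 + 2*b2*FDX + b5*INF + b6*INFVOL + 2*b7*FDX*INF + 2*b8*FDX*INFVOL"""
    return (b[1] + 2*b[2]*fdx + b[5]*inf + b[6]*infvol
            + 2*b[7]*fdx*inf + 2*b[8]*fdx*infvol)

fdx, inf, infvol = 1.5, 3.0, 0.8
h = 1e-6
numeric = (gdp_fdx_terms(fdx + h, inf, infvol)
           - gdp_fdx_terms(fdx - h, inf, infvol)) / (2*h)
print(abs(numeric - analytic(fdx, inf, infvol)) < 1e-5)
```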

        I ran the following GMM command:

        xtabond2 rgdpg ihs_inigdppc_lag1 fdxs2 fdxsquar2 ihs_inf c.fdxs2#c.ihs_inf c.fdxsquar2#c.ihs_inf ihs_gfcf ihs_gov ihs_trd ihs_lbor, gmm(rgdpg ihs_inigdppc fdxs2 fdxsquar2 c.fdxs2#c.ihs_inf c.fdxsquar2#c.ihs_inf , lag(2 2) collapse eq(diff)) iv(ihs_inf ihs_gfcf ihs_gov ihs_trd ihs_lbor, eq(diff)) gmm(rgdpg ihs_inigdppc fdxs2 fdxsquar2 c.fdxs2#c.ihs_inf c.fdxsquar2#c.ihs_inf, lag(2 .) collapse eq(level)) twostep robust

        I am trying to compute the standard error using the covariance matrix. The variance is:

        σ^2(dy/dx) = Var(β1) + 4 FDX^2 Var(β2) + INF^2 Var(β5) + INFVOL^2 Var(β6) + 4 FDX^2 INF^2 Var(β7) + 4 FDX^2 INFVOL^2 Var(β8) + 4 FDX Cov(β1,β2) + 2 INF Cov(β1,β5) + 2 INFVOL Cov(β1,β6) + 4 FDX INF Cov(β2,β5) + 4 FDX INFVOL Cov(β2,β6) + 4 FDX INF Cov(β1,β7) + 8 FDX^2 INF Cov(β2,β7) + 4 FDX INFVOL Cov(β1,β8) + 8 FDX^2 INFVOL Cov(β2,β8) + 4 FDX INF^2 Cov(β5,β7) + 4 FDX INFVOL^2 Cov(β6,β8) + the remaining cross terms: 2 INF INFVOL Cov(β5,β6) + 4 FDX INF INFVOL Cov(β5,β8) + 4 FDX INF INFVOL Cov(β6,β7) + 8 FDX^2 INF INFVOL Cov(β7,β8)
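        This expansion is the delta method: Var(dy/dx) = g' Σ g, where g is the gradient of the marginal effect with respect to the coefficients and Σ is their covariance matrix. Writing it in matrix form automatically picks up every pairwise covariance and is far less error-prone than expanding by hand. A sketch in Python, with a purely hypothetical covariance matrix (in Stata, Σ would come from e(V) after estimation, and -margins- performs this computation for you):

```python
# Delta-method variance of
#   dGDP/dFDX = b1 + 2*b2*F + b5*I + b6*V + 2*b7*F*I + 2*b8*F*V
# computed as g' Vcov g. All numbers are hypothetical placeholders.
F, I, V = 1.5, 3.0, 0.8

# Gradient of the marginal effect w.r.t. (b1, b2, b5, b6, b7, b8):
g = [1.0, 2*F, I, V, 2*F*I, 2*F*V]

# Hypothetical symmetric 6x6 covariance matrix of those coefficients:
vcov = [[0.04, 0.01, 0.00, 0.00, 0.00, 0.00],
        [0.01, 0.02, 0.00, 0.00, 0.00, 0.00],
        [0.00, 0.00, 0.03, 0.01, 0.00, 0.00],
        [0.00, 0.00, 0.01, 0.05, 0.00, 0.00],
        [0.00, 0.00, 0.00, 0.00, 0.02, 0.01],
        [0.00, 0.00, 0.00, 0.00, 0.01, 0.03]]

# Quadratic form g' Vcov g:
var = sum(g[i] * vcov[i][j] * g[j] for i in range(6) for j in range(6))
se = var ** 0.5
print(se)
```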

        1- Can I compute this using the -margins- command? If yes, what is the command, please?
        2- What is the -margins- command for an interaction term, say FDX * INF, at the mean, minimum, and maximum?

        Thank you

        Badiah



        • #5
          I want to measure the marginal effect of the interaction terms FDX * INF, FDX * INFVOL , FDX2 * INF and FDX2 * INFVOL on GDP
          Stop right there. There is no such thing as the marginal effect of an interaction term. As you have noted yourself, marginal effects are first-order partial derivatives. Interaction terms are associated with second-order mixed partial derivatives.

          You would be able to get marginal effects for all these variables if you made full use of factor variable notation. Your command only makes partial use of it, and as a result, application of -margins- would give incorrect results.

          The way to revise your command is to get rid of your hand-calculated squared variables and instead rely on factor-variable notation to emulate them in the regression. So, eliminate terms like fdxsquar2 and replace them with c.fdxs2#c.fdxs2. Also, more generally, use the ## operator instead of the # operator to ensure that all necessary subinteractions are automatically included. If you want to represent an interaction between X, X2, and Y, do that as c.X##c.X##c.Y. Then you can get the average marginal effect of X with -margins, dydx(X)-, or the marginal effects of X conditional on specified values of Y with -margins, dydx(X) at(Y = (list of specific values of Y))-.
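          Applied to your command, the revised syntax might look something like this (a sketch only: the gmm() and iv() instrument options from your original command are elided here, and the at() values are placeholders that you would replace with values of ihs_inf meaningful in your data):
          Code:
          xtabond2 rgdpg ihs_inigdppc_lag1 c.fdxs2##c.fdxs2##c.ihs_inf ihs_gfcf ihs_gov ihs_trd ihs_lbor, gmm(...) iv(...) twostep robust
          margins, dydx(fdxs2)
          margins, dydx(fdxs2) at(ihs_inf = (0 1 2))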

          Now, I am not a user of -xtabond2-, which is not an official Stata command. Your use of it suggests that it does support factor-variable notation, so I'm inferring that it does so fully, and will support this approach. If it does not, I'm afraid I can't offer you a workaround as I do not know much about its underlying mathematics.
