Interpretation of interaction in log-linear model

Surya Singh

Join Date: Sep 2014

Posts: 54
#1

09 Mar 2018, 07:30

Hi all,

I'm estimating a log-linear model with panel data using xtreg where my model is as follows:

log(y₁/y₂) = B₀ + B₁(policy) + B₂(infant) + B₃(policy*infant) + B₄log(x₁) + B₅(x₂) + B₆(year) + B₇(state)

where policy = 0 for control individuals living in states before the policy and = 1 for treated individuals living in states after the policy was introduced (the policy introduction is staggered across time)

infant = 1 only in the year the individual has an infant and = 0 otherwise

x₁ is log of wages
x₂ are other demographic and economic factors that are not log-transformed

How can I interpret the coefficient B₃, which is the instantaneous effect of the policy only during the year the individual has an infant?

Any help is appreciated!

Surya
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#2

09 Mar 2018, 09:17

Before answering your question, I notice that your model involves log(x1) and x1 is itself log of wages. So your model includes log log wages. There's nothing illegal about that, but it's very unusual and hardly ever seen. Are you sure this is what you want?

Your coefficient is the difference between the effect of policy on the expected value of log(y1/y2) when there is an infant born that year, and the effect when there is no infant born that year. Equivalently, you can also interpret it as the difference between the effect of birth of an infant in that year on the expected value of log(y1/y2) when and where the policy has been adopted and the same effect when and where the policy has not been adopted.
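To make that concrete, here is a sketch of the algebra. It writes E[· | policy, infant] for the expected value of log(y₁/y₂) with the other covariates held fixed; the β's simply mirror the B's in #1, and the notation is introduced here only for illustration.

```latex
\begin{align*}
\text{effect of policy when infant}=0:\quad
  & E[\,\cdot \mid 1, 0\,] - E[\,\cdot \mid 0, 0\,] = \beta_1 \\
\text{effect of policy when infant}=1:\quad
  & E[\,\cdot \mid 1, 1\,] - E[\,\cdot \mid 0, 1\,] = \beta_1 + \beta_3 \\
\text{difference between the two effects}:\quad
  & (\beta_1 + \beta_3) - \beta_1 = \beta_3 \\[4pt]
\text{effect of infant when policy}=0:\quad
  & E[\,\cdot \mid 0, 1\,] - E[\,\cdot \mid 0, 0\,] = \beta_2 \\
\text{effect of infant when policy}=1:\quad
  & E[\,\cdot \mid 1, 1\,] - E[\,\cdot \mid 1, 0\,] = \beta_2 + \beta_3
\end{align*}
```

Either way of reading it, β₃ is a difference between two effects, not an effect in its own right.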

Surya Singh

Join Date: Sep 2014
Posts: 54

11 Mar 2018, 08:40

Hi Clyde,

Thanks for your answer! Apologies I made a typo, I meant x1 is wages.

So if this is my most simple model output just with my outcome and policy variable:

Code:

. xtreg diff_incomeFTE3 i.policy_DID##i.infant

Random-effects GLS regression                   Number of obs     =     24,853
Group variable: ID_child                        Number of groups  =      6,111

R-sq:                                           Obs per group:
     within  = 0.0041                                         min =          1
     between = 0.0032                                         avg =        4.1
     overall = 0.0028                                         max =         14

                                                Wald chi2(3)      =      91.00
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

-----------------------------------------------------------------------------------
  diff_incomeFTE3 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------+----------------------------------------------------------------
     1.policy_DID |  -.6275243    .085002    -7.38   0.000    -.7941251   -.4609235
         1.infant |   .3175638   .0917841     3.46   0.001     .1376703    .4974572
                  |
policy_DID#infant |
             1 1  |  -.8637678   .2425685    -3.56   0.000    -1.339193   -.3883424
                  |
            _cons |   2.603681   .0620968    41.93   0.000     2.481973    2.725388
------------------+----------------------------------------------------------------
          sigma_u |  3.5910804
          sigma_e |  3.8081393
              rho |  .47068985   (fraction of variance due to u_i)
-----------------------------------------------------------------------------------

.

Could I also interpret my policy*infant coefficient as e^(-0.8638) -1 * 100% = 56% percentage point decrease in the outcome due to the policy when the infant is born?
Or would I take in consideration the coefficients of the policy and infant variable as well since it is an interaction term?
For example, e^(-0.6275 + 0.3176 - 0.8638) - 1 * 100% = 69% percentage point decrease in the outcome for those living in treated states when they had an infant?


Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#4

11 Mar 2018, 10:34

"Could I also interpret my policy*infant coefficient as e^(-0.8638) -1 * 100% = 56% percentage point decrease in the outcome due to the policy when the infant is born?"

No. Two things. First, the calculation exp(-0.8638) - 1 comes out as -0.578..., that is, -57.8%, not -56%. But more important, that is not what the interpretation of the interaction coefficient would be. This decrease of 57.8% is the difference between the impact of the policy in the presence of a birth and the impact of the policy when there is no birth. It is not the effect of the policy under either (or any) condition. It is the difference between them.
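On the arithmetic itself, the parentheses matter, because multiplication binds more tightly than subtraction. A quick check with -display-, using the coefficient rounded as in the question (a sketch only, nothing here depends on the data):

```stata
* without parentheses this evaluates exp(-0.8638) - (1*100), which is not a percentage
display exp(-0.8638) - 1 * 100

* with parentheses: percent change implied by a log-scale difference of -0.8638
display 100 * (exp(-0.8638) - 1)
```

The second line should print roughly -57.8.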

Rather than doing these calculations and interpretations by hand, I recommend you use the -margins- command. Start by reading Richard Williams' excellent https://www3.nd.edu/~rwilliam/stats/Margins01.pdf, which is very clear and also includes worked examples similar to yours. Then you can run

Code:

margins infant#policy_DID
margins infant, dydx(policy_DID)
margins policy_DID, dydx(infant)

The first of these will show you the predicted value of diff_incomeFTE3 in each combination of infant and policy_DID. The second will give you the marginal effects of policy_DID in both infant conditions, and the last will give you the marginal effects of infant in each condition of policy_DID.

One other point. When describing results, it is best to avoid using causal language such as "due to." The outcomes are associated with those differences in the predictors, but they may or may not be "due to" those predictors unless you are working with experimental data.
Surya Singh

Join Date: Sep 2014

Posts: 54
#5

11 Mar 2018, 11:01

Ah ok!! Thank you for the reference to the margins command. I will make sure to read that!

One last thing: say the coefficient on policy_DID*infant is significant (p < 0.05) when I run xtreg, but when I run the margins command afterwards, for example margins infant, dydx(policy_DID), the marginal effect of policy_DID when infant = 1 is insignificant. Do I then interpret the association as insignificant, or stick with the significance in the xtreg model?
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#6

11 Mar 2018, 11:19

The coefficient of the interaction term in the regression output estimates the difference between the effects of policy under two different conditions. If it is significant, that means that the two effects it is comparing significantly differ from each other. Two effects may differ from each other significantly regardless of whether either of them is, by itself, significantly different from zero. So there is not a single conclusion to be drawn. Rather you have to separately report the effects of the policy in each condition and also report the difference between them. Each of these three things can be either statistically significant or not independently of the others.
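As a sketch of how those three quantities could be pulled out of the model shown in #3: the variable names come from that output, the data are assumed to be -xtset- already, and -lincom- is just one way to do it (in this linear model, -margins infant, dydx(policy_DID)- should give the same point estimates for the first two).

```stata
xtreg diff_incomeFTE3 i.policy_DID##i.infant

* effect of policy when infant == 0
lincom 1.policy_DID

* effect of policy when infant == 1
lincom 1.policy_DID + 1.policy_DID#1.infant

* difference between the two effects (the interaction coefficient itself)
lincom 1.policy_DID#1.infant
```

With the estimates in #3, the second quantity is -0.6275 + (-0.8638) = -1.4913. Each of the three comes with its own standard error and p-value, which is why their significance can differ.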
Uwe Schmitt

Join Date: Mar 2021

Posts: 6
#7

04 Mar 2021, 08:16

I have a related question to this post: After running the margins command, is it correct to take the results of dydx and estimate them as follows: exp(result)-1 * 100 to get the marginal effect?

Last edited by Uwe Schmitt; 04 Mar 2021, 08:18.
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#8

04 Mar 2021, 11:44

No, it's not.

The outcome variable in the regression (let me call it y for short) is itself the log of a different variable z. You are interested in the marginal effect of x on the untransformed variable z, dz/dx, I presume.

Calculus: dz/dx = d(exp(y))/dx = exp(y)*dy/dx. dy/dx is the result seen in -margins, dydx(x)-. So you have to take the margins result and multiply it by exp(y), where y is the outcome variable, and you have to average that over all the observations. That's complicated. But you can do it directly in -margins- as

Code:

margins, dydx(x) expression(exp(predict(xb)))
Uwe Schmitt

Join Date: Mar 2021

Posts: 6
#9

05 Mar 2021, 02:52

Thank you so much for your response Clyde! I tried this code but not sure if that's correct. I think I formulated the question in the wrong way.

What I am interested in is the correct interpretation, even if the marginal effect of x is on the transformed variable. I want to be able to interpret: A one unit increase in x, let's say the female sample, increases y by ...percent. Like I'd do it in a log-linear model without an interaction term, but here for the interaction term. The coefficients are just very high (>0.3). That's why I applied the transformation.

Last edited by Uwe Schmitt; 05 Mar 2021, 03:19.
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#10

05 Mar 2021, 12:02

So if you have a non-log-transformed outcome variable z and you want to know the percent change of z associated with a unit increase in the dichotomous variable x, that's:

Code:

margins, eydx(x)

If, however, z is itself the logarithm of y and you want to know the relative change in y associated with a unit increase in x, again we can use calculus (as an approximation since x is actually discrete here):

z = log y
y = exp(z)

The relative change in y for a unit change in x is (1/y) dy/dx = exp(-z) * d(exp(z))/dx = exp(-z)*exp(z)*dz/dx = dz/dx. In other words, the semi-elasticity of x on y you seek is the same as the simple marginal effect of x on z:

Code:

margins, dydx(x)
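Because the calculus result is only an approximation when x is discrete, it may help to see the exact version next to it. Here Δz stands for the change in z = log y associated with a one-unit change in x; that symbol is introduced for illustration and is not something -margins- reports under that name.

```latex
\begin{align*}
\frac{y_1 - y_0}{y_0}
  &= \frac{\exp(z_1)}{\exp(z_0)} - 1
   = \exp(\Delta z) - 1,
  \qquad \Delta z = z_1 - z_0 \\
\exp(\Delta z) - 1 &\approx \Delta z
  \quad \text{when } |\Delta z| \text{ is small} \\[4pt]
\Delta z = 0.05 &: \quad \exp(0.05) - 1 \approx 0.051 \\
\Delta z = 0.30 &: \quad \exp(0.30) - 1 \approx 0.350
\end{align*}
```

So for small effects the marginal effect on the log scale and the exact relative change agree closely, and for larger ones they drift apart.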
Uwe Schmitt

Join Date: Mar 2021

Posts: 6
#11

05 Mar 2021, 12:47

Thank you so much for your response Clyde!
Honestly I am not sure if that describes my problem. English is not my native language and I think I did not formulate my question well.

I understood this part.
"In other words, the semi-elasticity of x on y you seek is the same as the simple marginal effect of x on z"

However, the result of the following code does not give me the value in percent, right? After the margins, dydx command, I have to calculate exp(result)-1 * 100 for each category (0/1), right?

My code is:

reg log_variable_1 c.variable_2##i.dummy_3 control_variable_4 control_variable_5, r

margins, dydx( variable_2) at (dummy_3=(0 1))

Now I want to know what the percentage change of log_variable_1 is if variable_2 increases by one unit, depending on dummy_3:

Once for dummy_3 1 and once for dummy_3 0

I would appreciate it very very much, if you could give me one last answer since this is very important for me.

Last edited by Uwe Schmitt; 05 Mar 2021, 13:16.