Interaction term interpretation

Jake Naismith

Join Date: Jul 2020

Posts: 18
#1

22 May 2022, 09:28

Dear all,

I hope you are doing well. I would like to ask if someone can help me understand more about interaction terms. Whenever I want to analyze interactions I feel lost. Let me present to you my model.

Change = B0 - 0.125(Performance) + 0.069(logTime) + 0.053(Performance*Time)

Performance and Time are two continuous variables. Performance is the performance of investors in a bank. logTime is the logarithm of the time investors take to make a decision. The dependent variable Change is a dummy equal to 1 if the investor changes investing strategy.

Does the interaction term mean that investors with low performance ratio and higher time spent to make a decision are likely to change their behaviour?

Many thanks for your help.

Jake
Clyde Schechter

Join Date: Apr 2014

Posts: 30357
#2

22 May 2022, 13:01

This is a very complicated, confusing model, and I would not even attempt to interpret it. Using Time in one place and logTime in the other makes the model extremely difficult to explain. I imagine that even highly experienced statisticians would have difficulty making sense of it. Go back and rerun it using the interaction Performance*logTime. Or use Time by itself, not its logarithm, in both places.

Code:

regression_command Change c.Performance##c.logTime

// OR

regression_command Change c.Performance##c.Time

Then show the results (by copy/pasting the actual Stata output into the Forum editor, between code delimiters) for more specific advice on interpretation.
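
To see why mixing Time and logTime is awkward, it may help to compare the partial derivatives under each specification. This is only a sketch: the symbols P (Performance), T (Time) and the betas are notation introduced here for illustration, and the right-hand side is treated as a plain linear predictor.

```latex
\begin{align*}
\text{Mixed:}\quad
  & \text{Change} = \beta_0 + \beta_1 P + \beta_2 \log T + \beta_3\, P\, T \\
  & \frac{\partial\,\text{Change}}{\partial P} = \beta_1 + \beta_3 T,
    \qquad
    \frac{\partial\,\text{Change}}{\partial T} = \frac{\beta_2}{T} + \beta_3 P \\[6pt]
\text{Consistent (log):}\quad
  & \text{Change} = \beta_0 + \beta_1 P + \beta_2 \log T + \beta_3\, P \log T \\
  & \frac{\partial\,\text{Change}}{\partial P} = \beta_1 + \beta_3 \log T,
    \qquad
    \frac{\partial\,\text{Change}}{\partial \log T} = \beta_2 + \beta_3 P
\end{align*}
```

In the mixed version the effect of time is linear in neither T nor log T, which is what makes it hard to describe. In the consistent version each effect is a simple linear function of the other variable.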

Jake Naismith

Join Date: Jul 2020
Posts: 18
#3

22 May 2022, 13:13

Sorry for the confusion. Time is already log-transformed in the interaction term.

This is the Stata output.

Code:

g inter = performance_ratio * logTime
eststo: reghdfe change l.performance_ratio l.logTime l.inter, a($fe) cluster(investment)


                   (Std. err. adjusted for 18,953 clusters in investment)
------------------------------------------------------------------------------
             |               Robust
    Man2Auto | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
performanc~o |
         L1. |  -.1253476   .0036149    -8.73   0.000    -.0386328   -.0244625
             |
     logtime |
         L1. |   .0069399   .0016482    63.67   0.000     .1017094    .1081703
             |
      inter2 |
         L1. |   0.053029   .0384919    31.96   0.000     1.154586    1.305471
             |
             |
       _cons |   .1916271   .0007449   257.25   0.000     .1901671    .1930871
------------------------------------------------------------------------------

Last edited by Jake Naismith; 22 May 2022, 13:15.


Clyde Schechter

Join Date: Apr 2014

Posts: 30357
#4

22 May 2022, 13:48

OK. This is a linear probability model with performance_ratio as a main explanatory variable, and its effect is modified by (log-transformed) time.

The basic interpretation of this is that when logtime = 0 (i.e. time = 1 in whatever units it was measured), each unit increase in performance ratio is associated with an approximately 0.125 decrease in the probability of Man2Auto. With longer time periods, logtime increases, and consequently, the corresponding marginal effect of performance ratio on the probability of Man2Auto moves towards 0. When we reach about logtime = 2.364 (=.1253476/0.053029) (corresponding to time approx = 10.63, assuming you used the natural logarithm in creating the logTime variable), the marginal effect of performance ratio on probability of Man2Auto actually equals 0 (to within a small rounding error). At still larger values of logTime, the marginal effect of performance ratio becomes positive, meaning that increasing values of performance ratio are associated with increasing probabilities of Man2Auto. In general, for a given value of logTime, the marginal effect of performance ratio on probability of Man2Auto will be -.1253476 + 0.053029*logTime.
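
The arithmetic in the paragraph above can be checked with a few lines of Stata. This is a minimal sketch that simply retypes the two coefficients from the output in #3; the scalar names b_perf, b_inter and lt0 are made up for illustration, and the grid of logTime values (0 to 4) is arbitrary, not taken from the data.

```stata
* Coefficients retyped from the regression output in #3
scalar b_perf  = -.1253476   // L1.performance_ratio
scalar b_inter = .053029     // L1 interaction term

* logTime at which the marginal effect of performance ratio is zero
scalar lt0 = -scalar(b_perf) / scalar(b_inter)
display "logTime where effect = 0: " %6.3f scalar(lt0)
display "corresponding Time:       " %6.2f exp(scalar(lt0))

* Marginal effect b_perf + b_inter*logTime on an arbitrary grid of logTime
forvalues lt = 0/4 {
    display "logTime = `lt'   marginal effect = " ///
        %8.4f (scalar(b_perf) + scalar(b_inter)*`lt')
}
```

The first two lines of output should reproduce the 2.364 and 10.63 quoted above. The loop shows the sign of the marginal effect switching from negative to positive between logTime = 2 and logTime = 3.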

Note, by the way, that because this is a fixed-effects regression, the terms "increase" and "decrease" refer always to changes in values within an individual level of $fe (whatever that may be). Nothing can be said about comparisons of entities with different values of $fe.

Now, so far there is no information provided about the range of values of logTime, or the units in which Time was measured. Similarly, nothing has been said about the variable performance ratio. If, for example, performance ratio has a range restricted between, say, 0 and 1, then a unit change would probably be larger than is ever observed. Similarly, Time = 1 may well be smaller than the smallest observed value of Time. So it is unclear whether the specific examples given in the second paragraph above are meaningful in the real world. What you should do is identify specific realistic values of the performance ratio and logTime variables and then actually calculate the marginal effect of performance ratio for those combinations and make a table or graph out of them. The simplest way to do that is with the -margins- command. To use it, however, you must go back and revise your regression to use factor variable notation. That means dropping your home-brew interaction term and then running code like this:

Code:

reghdfe Man2Auto l1.c.performance_ratio##l1.c.logTime, a($fe) cluster(investment)
margins, dydx(performance_ratio) at(performance_ratio = (list of values) logTime = (list of values))
marginsplot // (if you want a graph)

The output of the -margins- command will give you a table of the marginal effects of performance ratio corresponding to the values of performance ratio and logTime you specify. These numbers will probably be more useful than the abstract discussion given above.
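
As a concrete illustration of that workflow, here is a hedged sketch with the placeholders filled in. It is untested against the actual data and rests on several assumptions: the data are already -xtset- so that the L1. operator works, reghdfe is installed, and $fe is defined as in #3. The variable names lag_perf and lag_logTime are hypothetical, and the at() grid 0(1)4 is a placeholder that should be replaced with values from the observed range. The lags are generated explicitly only to keep time-series operators out of the -margins- call, which is a design choice of this sketch rather than a requirement stated in the thread.

```stata
* Create the lagged regressors explicitly (hypothetical variable names)
generate lag_perf    = L1.performance_ratio
generate lag_logTime = L1.logTime

* Inspect the observed ranges before choosing values for at()
summarize lag_perf lag_logTime, detail

* Same model as in #3, with the interaction in factor-variable notation
reghdfe Man2Auto c.lag_perf##c.lag_logTime, a($fe) cluster(investment)

* Marginal effect of performance ratio at selected values of logTime
* (0(1)4 is a placeholder grid, not taken from the data)
margins, dydx(lag_perf) at(lag_logTime = (0(1)4))
marginsplot
```

Because the model is linear, the marginal effect of lag_perf depends only on lag_logTime, so the sketch varies only that variable in at().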
Jake Naismith

Join Date: Jul 2020

Posts: 18
#5

22 May 2022, 14:25

Thank you very much for the detailed information Clyde. This really helps.