Interpretation of results of categorical#continuous interaction in comparison to reparameterized regression model

Felix Behlau

Join Date: Mar 2019

Posts: 5
#1

19 Mar 2019, 08:37

Hello, nice to meet you.

After searching this forum and Google for several days without clear results, I hope you can help me with this question.
I am using Stata 14.2.

In my post I refer to two sources that I found during my research. They already helped somewhat, but I hope that together with you I can fully resolve the question.

Source (1): https://stats.idre.ucla.edu/stata/fa...tion-stata-12/
Source (2): https://stats.idre.ucla.edu/stata/fa...n-interaction/

My interpretation problem concerns the different coefficients and significance levels I get depending on the syntax used for the interaction term.

More specifically, I am trying to understand the interaction between a dummy/categorical variable HILOTEC and a continuous variable CLO_mc and their effect on my dependent variable CEP_IEM. In my model I would like to test the "moderation" effect of HILOTEC (does HILOTEC moderate the regression of CEP_IEM on CLO_mc?).

With my data, if I run a regression using the command:

reg CEP_IEM i.HILOTEC##c.CLO_mc
I get output similar to 'Source 2: Case 2 categorical by continuous interaction',

where the interaction term i.HILOTEC#c.CLO_mc is not significant:

MODEL 1

If I use the command:

reg CEP_IEM HILOTEC i.HILOTEC#c.CLO_mc
the coefficients and significance levels change (again similar to Source 2).

In this model, which the authors of Source 2 describe as a reparameterization of MODEL 1 (with the same number of degrees of freedom and the same R2), the CLO_mc coefficients for both HILOTEC levels are significant.

MODEL 2

This is where I have trouble understanding and interpreting the results (and Source 1 did not help, because it only covers MODEL 1).
Key questions are:
Which model do I need to refer to in order to interpret the interaction?

What would the interpretation of these results be (or can it not be done at this stage)?
(ALSO: what would the interpretation be if, in MODEL 2, one of the interactions were significant and the other were not?)

Thank you very much in advance for your help and best regards!
Tags: categorical, interaction, regression, syntax
Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#2

19 Mar 2019, 13:30

The two models are, indeed, reparameterizations of the same model, but the meanings of coefficients that have the same names in the two models are very different. If you are comparing a coefficient in one model with the coefficient of the same name in the other you are not even comparing apples and oranges, you are comparing airplanes with starfish.

The coefficients in Model 2 under the HILOTEC#c.CLO_mc heading are perhaps easier to understand. The one called Lowtech is the estimate of the CEP_IEM:CLO_mc slope when HILOTEC = Lowtech. The other one, called Hightech, is the estimate of the CEP_IEM:CLO_mc slope when HILOTEC = Hightech. By the way, neither of these, by itself, can properly be called an interaction. They are just slopes. The pair of them together can be called an interaction if you like. An interaction, by definition, represents some kind of difference between slopes.

In Model 1, the estimated slope of the CEP_IEM:CLO_mc relationship when HILOTEC = Lowtech is found as the coefficient of CLO_mc, and is not under the interaction term heading. You will notice that, at .5196661, it agrees with the result in Model 2. As for the Model 1 estimate of the CEP_IEM:CLO_mc slope when HILOTEC = Hightech, it does not appear directly in the output at all. To get that, you have to add the coefficient appearing as Hightech under the HILOTEC#c.CLO_mc interaction heading to that .5196661, and, lo and behold, when you do add .2935645 to .5196661 you get .8132306, which is the same result that Model 2 gives you. Interpreting models parameterized this way is somewhat difficult and takes getting used to. Stata can make that easier for you, however. If, after Model 1, you run -margins HILOTEC, dydx(CLO_mc)- you will get output that corresponds to what you got from Model 2.
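
The correspondence between the two parameterizations can be reproduced with a small sketch. It assumes Stata's shipped auto dataset as a stand-in for the poster's data, with foreign playing the role of HILOTEC, weight the role of CLO_mc, and price the role of CEP_IEM; the scalar names are made up for illustration.

```stata
* Illustration only: auto.dta stands in for the poster's data
* (foreign ~ HILOTEC, weight ~ CLO_mc, price ~ CEP_IEM).
sysuse auto, clear

* Model 1: main effects plus interaction
regress price i.foreign##c.weight

* Slope of weight in the base group (foreign = 0)
scalar slope_base = _b[weight]
* Slope in the other group = base slope + interaction coefficient
scalar slope_other = _b[weight] + _b[1.foreign#c.weight]

* Group-specific slopes with standard errors, still from Model 1
margins foreign, dydx(weight)

* Model 2: same fit, but each group's slope is reported directly
regress price i.foreign i.foreign#c.weight
display "base group:  " scalar(slope_base)  "  vs  " _b[0.foreign#c.weight]
display "other group: " scalar(slope_other) "  vs  " _b[1.foreign#c.weight]
```

Each display line should show the same number twice, which is the sense in which the two models are one fit written two ways.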
Felix Behlau

Join Date: Mar 2019

Posts: 5
#3

20 Mar 2019, 03:52

Dear Clyde,

thank you very much for your quick reply which was very helpful to me.
Could you kindly elaborate also on the role of the significance levels of the different coefficients e.g. in model 1 and how they can be interpreted in this context?

From my understanding, the output of MODEL 1 would tell me something like:
Hightech significantly influences CEP_IEM

CLO_mc significantly influences CEP_IEM (for HILOTEC=Lowtech)

Interaction of CLO_mc and HILOTEC does not significantly influence CEP_IEM

Thank you very much in advance and best regards!
Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#4

20 Mar 2019, 10:15

Well, as I said, the p-values are interpreted the same for these things as any other variable. Personally, I dislike using significance testing in these models and really avoid doing it. But for the sake of brevity, I'll spare you my lengthy rant on why you shouldn't do it.

Your interpretation of model 1 is wrong. In this model (or in model 2) there is no such thing as the influence of CLO_mc on CEP_IEM, nor any such thing as the influence of HILOTEC on CEP_IEM. Rather, there are two different influences of CLO_mc on CEP_IEM, one for Hightech and the other for Lowtech. And there are infinitely many influences of HILOTEC on CEP_IEM, one for each of the infinitely many possible values of the continuous variable CLO_mc.
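
To make the "one influence per value of the continuous variable" point concrete, here is a minimal sketch. It again assumes the shipped auto dataset as a stand-in (foreign for HILOTEC, weight for CLO_mc, price for CEP_IEM), and the weights 2000, 3000 and 4000 are arbitrary illustration points.

```stata
* Illustration only: auto.dta stands in for the poster's data.
sysuse auto, clear
regress price i.foreign##c.weight

* Group difference (foreign vs. domestic) evaluated at several weights;
* a different estimate is reported at each value
margins, dydx(foreign) at(weight=(2000 3000 4000))

* The same quantity by hand at weight = 3000:
* coefficient of 1.foreign plus 3000 times the interaction coefficient
lincom 1.foreign + 3000*1.foreign#c.weight
```

The coefficient reported for the categorical variable in the regression table is just the special case of this difference at a value of zero for the continuous variable.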

What you can conclude directly from the output of model 1 in terms of statistical significance is:
A. There is a statistically significant effect of being Hightech (as opposed to Lowtech) when CLO_mc = 0. (p = 0.011)
B. There is a statistically significant slope of the CEP_IEM:CLO_mc relationship when HILOTEC = Lowtech. (p = 0.002)
C. The difference between the slopes of the CEP_IEM:CLO_mc relationships (Hightech vs Lowtech) is not statistically significant. (p = 0.137)

Just working directly from the output shown in Model 1 you cannot draw any conclusion about the statistical significance of the CEP_IEM:CLO_mc relationship when HILOTEC = Hightech. To get that, you would need to run the -margins- command I showed in #2.
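
As an alternative to -margins-, the slope for the non-base group and its test can also be obtained with -lincom-. This is a sketch under the same assumption that the shipped auto dataset stands in for the poster's data (foreign for HILOTEC, weight for CLO_mc, price for CEP_IEM).

```stata
* Illustration only: auto.dta stands in for the poster's data.
sysuse auto, clear
regress price i.foreign##c.weight

* Slope of weight when foreign = 1: base slope plus interaction
* coefficient, reported with standard error, t statistic, p-value and CI
lincom weight + 1.foreign#c.weight
```

The estimate matches the foreign = 1 row of -margins foreign, dydx(weight)-.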

I will once again emphasize that I do not regard any of these conclusions, or others relying on statistical significance, as interesting, useful, or even meaningful.
Felix Behlau

Join Date: Mar 2019

Posts: 5
#5

21 Mar 2019, 06:37

Dear Clyde,

thank you very much again for your thorough explanation, which I will review in detail in the coming days.
Are there any further resources you can recommend to help a beginner understand the different effects, especially the interpretation of results from interactions within regressions (i.e. using the -margins- command and looking at the slopes)?

Thank you very much in advance!
Clyde Schechter

Join Date: Apr 2014

Posts: 30119
#6

21 Mar 2019, 12:20

Richard Williams's excellent https://www3.nd.edu/~rwilliam/stats/Margins01.pdf is an exceedingly clear introduction to the -margins- command, and it includes worked and explained examples similar to yours. You might also look at some of the lecture notes from his regression classes, which you can find by navigating his website.