  • Interaction of two continuous variables

    I have some questions about a model with an interaction between two continuous variables.

    Assume that the two variables are X1 and X2.

    1. What is the interpretation of the coefficients on X1 and X2 in the interaction model (assume the model is: y=a0+a1*X1+a2*X2+a3*X1*X2+error term)?
    Is the interpretation of a2 the effect of X2 on y when X1=0?

    2. How can I interpret a2 (or a1) and a3 if:
    1. a2 is significant and positive in the main model (without the interaction), and is negative and significant in the interaction model.
    2. a2 is insignificant and positive in the main model (without the interaction), and is negative and significant in the interaction model.
    3. a2 is significant and positive in the main model (without the interaction), and is negative and insignificant in the interaction model.
    4. a2 is significant and positive in the main model (without the interaction), and is negative and significant in the interaction model.
    Or is there any general rule for such interpretations?

    3. Can I use logX1 and logX2 as the independent variables in the interaction model and X1*X2 as the interaction term, as below?

    Model: y=a0+a1*logX1+a2*logX2+a3*X1*X2+error term

    Or can I use some other transformation of X1, X2 and X1*X2?

    Thank you in advance for your help and responses.
    Best regards.


    Last edited by Michael Lee; 02 Jul 2022, 06:58.

  • #2
    Michael:
    why not share what you typed and what Stata gave you back (as per the FAQ), instead of drafting your query in a multiple-choice, test-like format? Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      In short, adding an interaction to a model, regardless of the type or significance of anything, is to state an assumption or admission that there is no single (common) effect of either variable. Instead, each variable may be seen as modifying the effect of the other. The model with the interaction makes a1 and a2 conditional on the other variable, so those coefficients cannot be directly compared to the ones from the model without the interaction.

      The effect of X1 will vary with X2, and vice versa. The amount by which the effect of X1 varies depends in turn on the specific value of X2. Statistical significance has nothing to say here.

      A statistically significant interaction term is evidence of effect modification in your dataset. However, a lack of statistical significance says nothing about the true relationship, only that these data provide no evidence in support of an interaction. Sometimes this is taken as permission to simplify the model to one without the interaction term.

      • #4
        Interactions involving continuous variables tend to hurt my head. I think it helps if you center the variables and/or graph the relationship. For more see

        https://www3.nd.edu/~rwilliam/stats2/l55.pdf

        Also you have to be careful about interpreting main effects once interaction effects are added to the model. See

        https://www3.nd.edu/~rwilliam/stats2/l53.pdf
        -------------------------------------------
        Richard Williams
        Professor Emeritus of Sociology
        University of Notre Dame
        StataNow Version: 19.5 MP (2 processor)

        WWW: https://academicweb.nd.edu/~rwilliam/

        • #5
          Thank you Leonardo for your precise answer.
          And thank you Richard for sharing these insightful resources. They helped a lot.

          Could you also help me with the answer to the last question?

          Thank you again!
          Last edited by Michael Lee; 02 Jul 2022, 10:54.

          • #6
            Your variables should enter the interaction term in the same form in which they appear as constituent main effects. In your example, I would not know what it means to have, say, logX1 and logX2 together with the interaction X1*X2.
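
            A minimal sketch of a specification in which the logs enter both the main effects and the interaction. It assumes the auto dataset shipped with Stata as a stand-in for the poster's data; ln_mpg and ln_head are illustrative names, not anything from the original question.

            ```stata
            * Sketch only: auto data used as a stand-in, variable names are illustrative
            sysuse auto, clear

            * Log-transform both regressors (both are strictly positive in these data)
            generate double ln_mpg  = ln(mpg)
            generate double ln_head = ln(headroom)

            * c.x##c.z adds both main effects and their product, so the
            * interaction is built from the same (logged) terms as the main effects
            regress price c.ln_mpg##c.ln_head
            ```

            Using factor-variable notation rather than a hand-made product variable keeps the main effects and the interaction tied to the same transformation by construction.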

            • #7
              I agree with Leonardo: I would interact logX1 and logX2 if those are the level terms. And, as Richard said, you should center around interesting values, such as the means, to give the main effects an interpretation as average partial effects.
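
              Centering at the means could be sketched as follows. This assumes the auto dataset as a stand-in; mpg_c and head_c are illustrative names. The last lines cross-check that the coefficient on the centered mpg matches a2 + a3*mean(headroom) from the uncentered fit.

              ```stata
              * Sketch only: auto data used as a stand-in, variable names are illustrative
              sysuse auto, clear

              * Center each regressor at its sample mean
              summarize mpg, meanonly
              generate double mpg_c = mpg - r(mean)

              summarize headroom, meanonly
              generate double head_c = headroom - r(mean)

              * Main effects plus interaction of the centered variables
              regress price c.mpg_c##c.head_c

              * Cross-check against the uncentered model: the effect of mpg at
              * mean headroom is _b[mpg] + mean(headroom)*_b[c.mpg#c.headroom]
              summarize headroom, meanonly
              local hbar = r(mean)
              quietly regress price c.mpg##c.headroom
              lincom _b[mpg] + `hbar'*_b[c.mpg#c.headroom]
              ```

              The lincom estimate should agree with the coefficient on mpg_c in the centered regression; only the point at which the effect is evaluated has changed, not the fitted model.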

              • #8
                Interpretation in such models is seen by taking derivatives. Your model is

                E(y|X1,X2)=a0+a1*X1+a2*X2+a3*X1*X2,

                and then the marginal effect of X2 on E(y|X1,X2) is

                dE(y|X1,X2)/dX2 = a2 + a3*X1, so

                the marginal effect of X2 on E(y|X1,X2) is not a constant but a function of X1.

                You can evaluate this marginal effect at any value of X1 of interest, e.g., at the mean/median of X1, at the 95th percentile of X1, or at a specific value such as X1=666.

                If you center your X1 at 666, your marginal effect becomes

                dE(y|X1,X2)/dX2 = b2 + b3*(X1 - 666). If you evaluate this function at X1=666, you see that the marginal effect is dE(y|X1,X2)/dX2 = b2. That is, if you center, the marginal effect of X2 on E(y|X1,X2) at X1=666 is simply the parameter on X2, so you can disregard the interaction term at that point. This is the sense in which centering helps interpretation.
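
                The link between the original coefficients (the a's) and the centered ones (the b's) can be worked out by writing X1 = (X1 - c) + c, where c is shorthand introduced here for the centering value (666 above):

                ```latex
                \begin{aligned}
                E(y \mid X_1, X_2) &= a_0 + a_1 X_1 + a_2 X_2 + a_3 X_1 X_2 \\
                &= (a_0 + a_1 c) + a_1 (X_1 - c) + (a_2 + a_3 c)\, X_2 + a_3 (X_1 - c)\, X_2 ,
                \end{aligned}
                \qquad\text{so}\qquad
                b_0 = a_0 + a_1 c,\quad b_1 = a_1,\quad b_2 = a_2 + a_3 c,\quad b_3 = a_3 .
                ```

                So centering leaves the coefficients on X1 and on the interaction unchanged and only shifts the intercept and the coefficient on X2, which becomes the marginal effect of X2 at X1 = c.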

                Here is an example:

                Code:
                . clear
                
                . sysuse auto
                (1978 automobile data)
                
                . gen double mpghead = mpg*headroom
                
                . gen double headroom666 = headroom - 666
                
                . gen double mpghead666 = mpg*headroom666
                
                . reg price headroom mpg mpghead
                
                      Source |       SS           df       MS      Number of obs   =        74
                -------------+----------------------------------   F(3, 70)        =      7.14
                       Model |   148844287         3  49614762.3   Prob > F        =    0.0003
                    Residual |   486221109        70  6946015.85   R-squared       =    0.2344
                -------------+----------------------------------   Adj R-squared   =    0.2016
                       Total |   635065396        73  8699525.97   Root MSE        =    2635.5
                
                ------------------------------------------------------------------------------
                       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                    headroom |   1224.462   1963.955     0.62   0.535    -2692.523    5141.446
                         mpg |  -42.41371   273.6704    -0.15   0.877     -588.232    503.4046
                     mpghead |    -76.377   94.22537    -0.81   0.420    -264.3036    111.5496
                       _cons |   8119.723   6001.821     1.35   0.180    -3850.533    20089.98
                ------------------------------------------------------------------------------
                
                . dis "d(price)/d(mpg) at (headroom = 666) = " -42.41371 +(-76.377)*666
                d(price)/d(mpg) at (headroom = 666) = -50909.496
                
                . reg price headroom666 mpg mpghead666, noheader
                ------------------------------------------------------------------------------
                       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                 headroom666 |   1224.462   1963.955     0.62   0.535    -2692.523    5141.446
                         mpg |   -50909.5    62486.8    -0.81   0.418    -175535.5    73716.49
                  mpghead666 |    -76.377   94.22537    -0.81   0.420    -264.3036    111.5496
                       _cons |   823611.2    1302133     0.63   0.529     -1773412     3420634
                ------------------------------------------------------------------------------
                
                .
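
                As an alternative to building the product and the centered variables by hand, the same quantities can be sketched with factor-variable notation and margins. This is only an illustration on the same auto data; the headroom values 2, 3 and 4 are arbitrary choices, not values from the thread.

                ```stata
                * Sketch only: same auto data, factor-variable notation instead of
                * hand-made product variables
                sysuse auto, clear
                regress price c.mpg##c.headroom

                * Marginal effect of mpg at a few illustrative headroom values,
                * with standard errors and confidence intervals
                margins, dydx(mpg) at(headroom = (2 3 4))

                * Marginal effect of mpg at headroom = 666, the value used for
                * centering in the hand-made example
                margins, dydx(mpg) at(headroom = 666)

                * Marginal effect of mpg with covariates held at their sample means
                margins, dydx(mpg) atmeans
                ```

                The at(headroom = 666) line should match the coefficient and standard error on mpg in the regression with headroom666, since both evaluate a2 + a3*666.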




                • #9
                  As for using logs of the levels together with an interaction of the (unlogged) levels, the computer would not break if you did it. However, I agree with the advice given above that you should not do it, and this can be justified by the "principle of marginality":
                  The principle of marginality implies that, in general, it is wrong to test, estimate, or interpret main effects of explanatory variables where the variables interact or, similarly, to model interaction effects but delete main effects that are marginal to them.
                  Here are some references on this "principle of marginality":

                  Ezquerra, L., Kolev, G. I., & Rodriguez-Lara, I. (2018). Gender differences in cheating: Loss vs. gain framing. Economics Letters, 163, 46-49.

                  Nelder, J. A. (1977). A reformulation of linear models. Journal of the Royal Statistical Society. Series A (General), 140, 48-77.

                  Weisberg, S. (2014). Applied Linear Regression (4th ed.). John Wiley & Sons. (page 139).

                  • #10
                    Got it. Thank you Leonardo, Jeff and Joro for your insightful answers!
