
  • interaction term same value as main variable

    Dear all,

    Since I think it is more appropriate to ask this question in a different topic, I created a new topic.
    I was wondering what it means if the coefficient on my interaction term has the same value as the coefficient on one of the variables it is interacted with, but with the opposite sign.

    I am estimating the effect of a policy (which can exist in two forms, hence a binary variable) on district revenue. Since not only the policy but also its intensity matters (irrespective of the policy, the intensity is always positive), I included an interaction term. However, the coefficient on the interaction term (which applies when policy==1) is almost equal in magnitude, but opposite in sign, to the coefficient on the intensity variable. If the policy takes on the value of 1 because the first policy is in place, it then seems like there is no effect of intensity, correct? I was wondering if this indicates some mistake.

    Code:
     xtreg lnDistrict_Revenue L.i.P##L.c.Intensity c.L.lnUrbanPopulation##c.L.lnUrbanPopulation c.L.lnPropertyvalue c.L.lnGrant##c.L.lnGrant c.L.lnIncome_percapita c.L.ShareUnemployed c.L.ShareElderly c.L.ShareYoung L.lnSpending i.Year, fe cluster(District)
     
    Fixed-effects (within) regression               Number of obs     =      3,204
    Group variable: District                        Number of groups  =        298
     
    R-sq:                                           Obs per group:
         within  = 0.7285                                         min =          6
         between = 0.0179                                         avg =       10.8
         overall = 0.0325                                         max =         11
     
                                                    F(23,297)         =     172.78
    corr(u_i, Xb)  = -0.9326                        Prob > F          =     0.0000
     
                                                              (Std. Err. adjusted for 298 clusters in District)
    -----------------------------------------------------------------------------------------------------------
                                              |               Robust
                           lnDistrict_Revenue |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ------------------------------------------+----------------------------------------------------------------
                                          L.P |
                                           1  |   .0351146   .0151471     2.32   0.021     .0053053    .0649238
                                              |
                                    Intensity |
                                          L1. |     .19091   .0975182     1.96   0.051    -.0010041    .3828242
                                              |
                             L.P#cL.Intensity |
                                           1  |  -.1870479   .1081467    -1.73   0.085    -.3998788     .025783
                                              |
                            lnUrbanPopulation |
                                          L1. |   2.370568   1.676327     1.41   0.158    -.9284165    5.669553
                                              |
    cL.lnUrbanPopulation#cL.lnUrbanPopulation |  -.1596966   .0787233    -2.03   0.043    -.3146229   -.0047704
                                              |
                              lnPropertyvalue |
                                          L1. |  -.7620712   .0773144    -9.86   0.000    -.9142247   -.6099178
                                              |
                                      lnGrant |
                                          L1. |   .9593062   .2999702     3.20   0.002     .3689699    1.549642
                                              |
                        cL.lnGrant#cL.lnGrant |  -.0228989   .0081084    -2.82   0.005    -.0388561   -.0069416
                                              |
                           lnIncome_percapita |
                                          L1. |   .0228362   .1156852     0.20   0.844    -.2048304    .2505027
                                              |
                              ShareUnemployed |
                                          L1. |  -.0163021   .0095847    -1.70   0.090    -.0351648    .0025605
                                              |
                                 ShareElderly |
                                          L1. |  -.0026668    .002733    -0.98   0.330    -.0080453    .0027117
                                              |
                                   ShareYoung |
                                          L1. |  -.0039125    .003059    -1.28   0.202    -.0099326    .0021076
                                              |
                                   lnSpending |
                                          L1. |  -.0034049   .0030913    -1.10   0.272    -.0094886    .0026787
                                              |
                                         Year |
                                        2002  |   .0454305   .0154763     2.94   0.004     .0149733    .0758876
                                        2003  |    .110855   .0188005     5.90   0.000      .073856     .147854
                                        2004  |   .1813692   .0234291     7.74   0.000     .1352611    .2274773
                                        2005  |   .1985591   .0317878     6.25   0.000     .1360012     .261117
                                        2006  |   .1323212   .0377318     3.51   0.001     .0580657    .2065767
                                        2007  |   .1298228    .042398     3.06   0.002     .0463842    .2132615
                                        2008  |   .1263483   .0453511     2.79   0.006     .0370981    .2155986
                                        2009  |   .1168673   .0481765     2.43   0.016     .0220568    .2116778
                                        2010  |   .1327302   .0562972     2.36   0.019     .0219383    .2435221
                                        2011  |   .1738778   .0612029     2.84   0.005     .0534314    .2943241
                                              |
                                        _cons |  -5.173469   8.324948    -0.62   0.535    -21.55683    11.20989
    ------------------------------------------+----------------------------------------------------------------
                                      sigma_u |  .66077134
                                      sigma_e |  .06458734
                                          rho |  .99053626   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------------------------

    When I split the sample based on the policy (0 or 1) instead of using the interaction term, it also seems like there is no effect of intensity when the policy is 1, compared to when the policy is 0. I know this is not exactly the same, because splitting the sample in effect interacts every variable with the policy.


  • #2
    If the policy takes on the value of 1 because the first policy is in place, it then seems like there is no effect of intensity, correct?
    Contrary to popular usage, it is not really correct to characterize a zero or near-zero marginal effect as "no" effect. It may be small. And perhaps its confidence interval overlaps zero, but you would need an infinitely large study to actually assert that the effect is exactly zero.

    Pedantry aside, you're correct. If you would like the exact marginal effect of Intensity when Policy = 1, you can run
    Code:
    margins, dydx(L.Intensity) at(L.P = 1)
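    Using the coefficient estimates from your output, that conditional marginal effect can also be checked by hand: it is just the sum of the Intensity coefficient and the interaction coefficient. A back-of-the-envelope sketch in Python (-margins- additionally supplies the delta-method standard error and confidence interval):

```python
# Coefficients read off the regression table in #1
b_intensity = 0.19091     # L.Intensity
b_interact = -0.1870479   # 1.L.P#cL.Intensity

# Marginal effect of L.Intensity conditional on L.P = 1 is the sum
me_at_p1 = b_intensity + b_interact
print(round(me_at_p1, 7))  # → 0.0038621 -- essentially zero
```

    This is why the two coefficients appear to "cancel out": when the policy is in place, the effect of Intensity is their sum, not either coefficient alone.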



    • #3
      Thank you Clyde, I indeed ran the -margins- command. Does it indicate some misspecification, though? Or is it common when there is simply a near-zero marginal effect for some subgroup? I ran my regression several times with different independent variables and different lags, but the two variables almost always cancel each other out.



      • #4
        It does not indicate model mis-specification. There is no reason it can't turn out that way. I can't say whether it's common or not: it depends on what research questions you are investigating and how you design your studies. Sometimes you design a study with the intent that in one group the marginal effect of some variable will be (near) zero--in effect, a control group. In those designs, you will very commonly see the kind of result you got. But even if you didn't specifically design your study that way, there is no law that says it can't work out that way in naturalistic data.

        Does theory or previous research in this area suggest that this result is implausible? If so, then I would think hard about how the variables were operationalized, whether there are data errors, or whether there is something peculiar about the population from which your data is sampled.



        • #5
          Thank you Clyde,

          Your explanation is very clear and understandable for students. Thank you for that. Your answer gets me thinking further, which is only a good thing.



          • #6
            Again thank you Clyde for your wonderful help.

            I was wondering one more thing. If one runs a regression of the form Y = b1*x1 + b2*x2 + b3*x1*x2, the main effects are conditional on the other variable being 0. If the main effect of x1 is highly insignificant but x2 and the interaction are significant, can we then say that the effect of x1 being 1 and x2 being 3 is b2*3 + b3*1*3?
            So, in a sense, do we leave the value of b1 out completely, even if x2 is nonzero, because it is insignificant? Can we then say that x1 only has an effect through the interaction?



            • #7
              Can we then say that the effect of x1 being 1 and x2 being 3 is b2*3 + b3*1*3? So, in a sense, do we leave the value of b1 out completely, even if x2 is nonzero, because it is insignificant? Can we then say that x1 only has an effect through the interaction?
              In the situation you describe:

              The expected value of Y when x1 = 1 and x2 = 3 is b1*1 + b2*3 + b3*1*3 (or, more simply, b1 + 3*(b2+b3)).

              The marginal effect of x1 on Y when x1 = 1 and x2 = 3 is b1 + b3*3 (i.e. b1 + 3b3).
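              Both quantities are easy to verify numerically. A short Python sketch with made-up coefficients, purely for illustration:

```python
# Hypothetical coefficients, for illustration only
b1, b2, b3 = 0.5, 1.2, -0.4
x1, x2 = 1, 3

# Expected value of Y at x1 = 1, x2 = 3 (intercept omitted, as in the post)
ey = b1*x1 + b2*x2 + b3*x1*x2
assert abs(ey - (b1 + 3*(b2 + b3))) < 1e-12  # the simplified form agrees

# Marginal effect of x1 at x2 = 3: dY/dx1 = b1 + b3*x2
me_x1 = b1 + b3*x2
print(round(ey, 6), round(me_x1, 6))  # → 2.9 -0.7
```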

              The expression b2*3+b3*1*3 doesn't correspond to anything that has a name, and I can't even think of any way in which it would be meaningful.

              It is a fallacy, and, unfortunately, a very widespread one, to treat a non-statistically significant coefficient as if it were zero. Even for those who take statistical significance seriously as a concept, that is an incorrect interpretation. The closest you can come to a correct statement along those lines is that if a coefficient is not statistically significant, the model and data are inconclusive as to whether the coefficient is zero or not.



              • #8
                The closest you can come to a correct statement along those lines is that if a coefficient is not statistically significant, the model and data are inconclusive as to whether the coefficient is zero or not.
                Thank you Clyde,

                I assume that this statement is then true for any insignificant coefficient in the model, whether it is part of an interaction term or not. In a sense, then, you don't really treat coefficients that are part of an interaction term any differently.
                If I am truly focusing on significance, which one can of course debate, would it be correct if I make this statement:
                The coefficient b1 is not significantly different from zero, meaning that it is inconclusive whether x1 has a direct effect on y for different levels of x1 and x2. The coefficient of the interaction term is significant, which suggests that x1 has an effect together with x2, depending on the level of x2.

                Basically, I am wondering whether the insignificance of b1 means that you are inconclusive about the "main" effect at any level of x1 and x2, and not just at x2 = 0.

                Edit: I assume you can only say this for x2 = 0, since that is what is being tested. If so, is there any way to test for the other levels of x2 if this is the case?
                Last edited by Mimina John; 20 Aug 2022, 19:55.



                • #9
                  Basically, I am wondering whether the insignificance of b1 means that you are inconclusive about the "main" effect at any level of x1 and x2, and not just at x2 = 0.
                  No, you definitely can't say that. In fact, in an interaction model, there are always values of x2* for which the marginal effect of x1 (conditional on that value of x2) will be statistically significant. You cannot conclude anything about the marginal effect of x1 at any value other than x2 = 0 from just the coefficient b1.

                  is there any way to test for the other levels of x2 if this is the case?
                  Sure. The marginal effect of x1 for a given value, call it A, of x2 is b1 + b3*A. The easiest way to get this out of Stata is
                  Code:
                  lincom _b[x1] + A*_b[x1#x2]
                  Replace A there by the actual numeric value of x2 you are interested in and replace x1 and x1#x2 by the corresponding terms exactly as they appear in the regression output.

                  *Added: The values of x2 that do this may or may not fall within the range of observed, or even theoretically possible, values of x2. But just mathematically, for all sufficiently large (in magnitude) x2, the marginal effect of x1 conditional on that value of x2 will be statistically significant.
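                  What -lincom- reports can be sketched as a delta-method calculation: the point estimate is b1 + A*b3 and its variance is Var(b1) + A^2*Var(b3) + 2*A*Cov(b1,b3). A Python sketch with hypothetical coefficient and covariance values (in Stata the real ones are stored in e(b) and e(V)):

```python
import math

# Hypothetical estimates and (co)variances, for illustration only
b1, b3 = 0.04, -0.19            # coefficients on x1 and x1#x2
var_b1, var_b3 = 0.0002, 0.011  # squared standard errors
cov_b1_b3 = -0.0012             # Cov(b1, b3)

A = 2.0  # value of x2 at which to evaluate the marginal effect of x1

# Point estimate and standard error of b1 + A*b3
me = b1 + A * b3
se = math.sqrt(var_b1 + A**2 * var_b3 + 2 * A * cov_b1_b3)
print(round(me, 4), round(se, 4))  # → -0.34 0.1985
```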
                  Last edited by Clyde Schechter; 20 Aug 2022, 20:30.



                  • #10
                    Thank you Clyde.

                    Sure. The marginal effect of x1 for a given value, call it A, of x2 is b1 + b3*A.
                    In this case, I am just interested in the part that is explained by b1 and whether it is statistically significantly different from zero. For x2 = 0 the results suggest it is not, but I was wondering if there is any way of finding out for x2 = 1, for example. Correct me if I am wrong, but if I use -lincom-, then I am testing the "whole" marginal effect and not just the part that is explained by b1, correct?



                    • #11
                      What does "the part that is explained by b1" mean?



                      • #12
                        It’s probably not possible but:

                        b1 + b3*A

                        I of course have the coefficient b1, but I would like to check whether it is statistically different from zero, and that it is not just b3*A that is driving the results. If I use -lincom-, I am testing the marginal effect of x1 at different levels of A. Is looking at the p-value then enough to conclude that the direct effect of x1 (b1) is statistically different from zero at that level of A, and that it is not the 'indirect' effect in combination with A (b3*A) that is driving the results?

                        Let's say that I have a binary variable gender (x1), a continuous variable years of education (x2), their interaction term (x3), and wage as the outcome (y). In the regression, the binary variable is insignificant while x2 and x3 are significant. If we are looking at significance, we can say that we cannot conclude there is a main effect of gender at 0 years of education.
                        If I however want to know the marginal effect at 1 year of education, I can simply plug in 1 for A. If it comes back as significant, can I then conclude that b1 in b1 + b3*A is significant as well for A = 1?



                        • #13
                          can I then conclude that b1 in b1+b3*A is significant as well for A=1?
                          I'm going to do my best to respond to this, but bear in mind where I am coming from. I am among those who do not take statistical significance and p-values seriously (except in a few unusual situations). So I'm in the position of trying to help you do the best job you can of something that I believe is inherently inappropriate.

                          In an interaction model, it isn't really possible to separate the effects of two variables that have an interaction. The effect of one is always contingent on the value of the other. For that reason, it is not meaningful to talk about the statistical significance of either variable's effect in isolation. If you wanted to know, for example, if overall x1 has a statistically significant influence on y in the model y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error, the appropriate test would be -test x1 x1#x2-, the joint significance test of x1 and the interaction term. The only thing you can learn from a significance test on x1 alone is whether the marginal effect of x1 conditional on x2 = 0 is statistically significant. In many situations, that's not even a meaningful question to ask because x2 = 0 may be beyond the range of observed values of x2, or even beyond the theoretically possible range of values of x2. When x2 = 0 is a possible value of x2, then it is, at least, a meaningful question, but only occasionally is it of any special importance. Most likely this is the case for x2 as education: 0 years is possible but probably uncommon in the data and of little real importance.
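                          The joint test mentioned above is a Wald test on the two coefficients together. A minimal numeric sketch with hypothetical estimates and covariances (Stata's -test- pulls the real values from e(b) and e(V) and reports an F or chi-squared version of this statistic):

```python
import math

# Hypothetical estimates for b1 (x1) and b3 (x1#x2), illustration only
b1, b3 = 0.04, -0.19
v11, v12, v22 = 0.0002, -0.0012, 0.011  # Var(b1), Cov(b1,b3), Var(b3)

# Wald statistic for H0: b1 = 0 and b3 = 0, i.e. b'V^{-1}b in the 2x2 case
det = v11 * v22 - v12**2
W = (b1**2 * v22 - 2 * b1 * b3 * v12 + b3**2 * v11) / det

# With 2 degrees of freedom, the chi-squared survival function is exp(-W/2)
p = math.exp(-W / 2)
print(round(W, 2), round(p, 4))  # → 8.66 0.0132
```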

                          That conditional marginal effect is the only interpretation you can assign to b1, and its statistical significance does not change when you then use it in some other calculations, such as b1 + A*b3.

                          If A is sufficiently large, then, of course, A*b3 will dominate the calculation of b1 + A*b3. At a sufficiently large value of A (which may or may not be a realistic value for x2), in fact, the b1 term will be relatively negligible and you could just approximate the whole thing by A*b3. Similarly, when A is sufficiently close to 0, b1 will dominate the calculation, and you could even approximate the whole thing by just b1. That's algebra/calculus. It's not statistics. None of that says anything about the statistical significance of b1. The statistical significance of b1 is the one and only result shown in that row of the regression output table, and it doesn't change.
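                          That dominance point is plain arithmetic, easy to see with hypothetical numbers:

```python
# Hypothetical coefficients, for illustration only
b1, b3 = 0.5, -0.2

for A in (0.01, 1.0, 100.0):
    me = b1 + A * b3
    print(f"A={A}: b1 term = {b1}, A*b3 term = {A * b3}, sum = {me}")

# For A = 0.01 the b1 term dominates the sum; for A = 100 the A*b3 term does.
```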



                          • #14
                            Again thank you Clyde,

                            I hadn't noticed that you replied. Your explanation is very clear and helps my understanding. Sometimes I wish statistics were explained this clearly in class!

                            I have one more question. What exactly do you mean by

                            That conditional marginal effect is the only interpretation you can assign to b1, and its statistical significance does not change when you then use it in some other calculations, such as b1 + A*b3


                            Last edited by Mimina John; 23 Aug 2022, 06:06.



                            • #15
                              The only interpretation of b1 is that it is the marginal effect of X1 conditional on X2 = 0. Whether b1 is statistically significant or not depends only on b1 itself, its standard error, and in some models, on residual degrees of freedom. All of those are calculated directly from the data and are in no way contingent on planned or actual subsequent uses of b1. The fact that you might later use it in a formula b1 + A*b3 doesn't change any of that, so it doesn't change the statistical significance of b1.

