  • Positive interaction with negative main effect: interpretation

    Apologies for the length, for I'll try to give all the relevant info in one go. Members of my research team disagree with me on the interpretation of an experiment result and I am bringing the issue here to hopefully find an expert resolution that can convince us all. Who is right?

    We study effects of different political message frames, coming from sources aligned with different parties.
    - experimentally manipulated frame varies between 3 categories: neutral (baseline), N, and C.
    - experimentally manipulated alignment varies between 3 categories: independent source (baseline), co-partisan source, opposite-party source.

    So 3 x 3 = 9 conditions, and each respondent is randomly assigned to one. Outcome of interest is Qa, which is measured on a scale of 1 to 7. The main effect of C treatment on Qa was hypothesized and found to be significantly negative. We also have a couple of interaction hypotheses:
    H3a: C frames' effect on Qa attitudes will be the greatest when it is supplied by the co-partisan elites.
    H3b: C frames' effect on Qa attitudes will be the weakest when it is supplied by the opposite elites.
    The wording of the same in the ethical review application was: C frames’ effect on Qa attitudes will be greater (lower) if it is supplied by the co-partisan (opposite) elites.

    Allowing for the usual ambiguity of ordinary language, I believe that the meaning of the hypotheses is clear. Below is how I test them, with ib2.oper, i.order, and i.country2 as fixed effects for different iterations of the experiment, which I take to be irrelevant for the main interpretation.

    Code:
    reg Qa ib3.frame##i.alignment ib2.oper i.order i.country2, cluster( ResponseId )
    Here's the regression output:

    HTML Code:
                             |               Robust
                          Qa | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------------------+----------------------------------------------------------------
                       frame |
                          C  |   -.450276   .0725978    -6.20   0.000    -.5926267   -.3079253
                          N  |  -.2098803   .0717897    -2.92   0.003    -.3506466    -.069114
                             |
                   alignment |
         Co-partisan source  |   .0471406    .065567     0.72   0.472     -.081424    .1757052
      Opposite party source  |   -.224524   .0666856    -3.37   0.001    -.3552821   -.0937659
                             |
             frame#alignment |
       C#Co-partisan source  |  -.0078699   .1034016    -0.08   0.939    -.2106212    .1948813
    C#Opposite party source  |   .2276461   .1024118     2.22   0.026     .0268355    .4284566
       N#Co-partisan source  |  -.0797576    .100597    -0.79   0.428    -.2770096    .1174944
    N#Opposite party source  |    .109464   .1014582     1.08   0.281    -.0894767    .3084046
    Remember that C main effect was negative, and the individual coefficient for C here too is negative. The interaction of C with co-partisan source is insignificant and unsubstantial, so I cannot reject the null for H3a. But the interaction of C with opposite-party source is positive (i.e. inverse of C main effect), so I reject the null for H3b. My interpretation is, C frame is not more effective when it comes from co-partisan source, but it is indeed less effective when it comes from opposite party source, compared to an independent source.

    Now the disagreement is about H3b.

    Team member 1 disagrees with the test verdict because they interpret the positive interaction term as strengthening of C’s effect on Qa. I explain to them that since this sign is the opposite of C’s main sign, it is actually a weakening of C’s effect, by making it less negative.

    Team member 2 is concerned that interacting with C mitigates the opposite-party source’s own negative effect on Qa and they take this to be undermining my H3b interpretation. I explain that this is irrelevant for our hypotheses, which are actually about what happens to the C effect under said interaction.

    Team member 3 concedes that the hypothesis test verdict (lesser effects from opposite party) for H3b is technically correct, but they argue that a closer look at predicted values tells a contrary story. I explain to them that there cannot be a contradiction since predicted values are generated from the same regression used for the hypothesis test. Here are the values:

    Code:
    margins ib3.frame, over(ib0.alignment)
    HTML Code:
                                   |            Delta-method
                                   |     Margin   std. err.      t    P>|t|     [95% conf. interval]
    -------------------------------+----------------------------------------------------------------
                   alignment#frame |
             Independent source#C  |   3.995489   .0571731    69.88   0.000     3.883383    4.107595
             Independent source#N  |   4.235885   .0544534    77.79   0.000     4.129112    4.342658
       Independent source#neutral  |   4.445765   .0475833    93.43   0.000     4.352463    4.539067
             Co-partisan source#C  |   4.037345   .0571095    70.69   0.000     3.925364    4.149326
             Co-partisan source#N  |   4.205853   .0542435    77.54   0.000     4.099492    4.312214
       Co-partisan source#neutral  |   4.495491   .0463339    97.02   0.000     4.404639    4.586343
          Opposite party source#C  |   3.998388   .0587389    68.07   0.000     3.883212    4.113564
          Opposite party source#N  |   4.120602   .0525447    78.42   0.000     4.017571    4.223632
    Opposite party source#neutral  |   4.221018   .0479495    88.03   0.000     4.126998    4.315038
    So team member 3 is troubled by the fact that under C treatment, there is no outcome difference between sources, and wants to conclude that C treatment is not more or less effective when it comes from opposite party source. I explain that this indifference is irrelevant unless coupled with the observation that outcomes would actually differ across sources without C treatment (i.e. neutral conditions), and C reduces the difference by affecting respondents exposed to opposite-party source less than respondents in other source conditions. I explain that the opposite-party source already has a baseline negative association with Qa under neutral frame, and this is offset by the further negative effects of switching from neutral to C frame treatment, precisely because C effect is stronger under conditions other than the opposite-party. (I should again remind here that Qa was measured from 1 to 7, so there is no strict or theoretical floor at the value of 4).
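
    As a cross-check on the point above, the effect of C relative to neutral within each source condition can be read off the predictive margins by subtraction. A minimal sketch, using Stata's display as a calculator with the values copied from the margins table (nothing is re-estimated here):

    ```stata
    * C minus neutral within each source condition (values from the margins table above)
    display "Independent: " 3.995489 - 4.445765
    display "Co-partisan: " 4.037345 - 4.495491
    display "Opposite:    " 3.998388 - 4.221018
    ```

    The three differences come to roughly -0.45, -0.46, and -0.22, so near-identical outcomes across sources under C are compatible with a smaller C-versus-neutral difference under the opposite-party source.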

    I apologize again for the lengthy presentation. The question is, am I wrong in any of my interpretations? Are the other team members correct in any of theirs? Good people of Statalist, I thank you for your time.

  • #2
    Allowing for the usual ambiguity of ordinary language,...
    Your fatal error is allowing for the usual ambiguity of ordinary language, because that ambiguity is much larger than any of the effects you are modeling! The words stronger and weaker have no fixed relationship to positive or negative coefficients in a model. And even greater than or less than are, in ordinary language, used inconsistently when applied in settings where negative numbers come in. So you need to banish ordinary language from the discussion.

    Another semi-ordinary term that should be banished from the discussion is "main effect." It is an inherently ambiguous term. It is used to refer to the coefficient of a variable in the model when it is not part of any interaction term. The problem is that this "main effect" depends entirely on what the reference categories of the variables it interacts with are. So, if you changed ib3.frame##i.alignment to, say, ib2.frame##i.alignment, the "main effect" of C (or N, or, for that matter, the alignment categories) would be something different. There is no reasonable sense in which something that is inherently ill-defined is "main." Unfortunately, the terminology is entrenched and it generates endless confusion, because, following the instincts of ordinary language usage, people tend to think of it as some "overall" or "average" effect of the variable--which it most definitely is not.
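
    The dependence on reference categories can be illustrated by refitting the same model with a different base level. This is only a sketch: it assumes opposite-party source is coded 2 in alignment, which the thread does not confirm, so the value labels should be checked first.

    ```stata
    * Check the numeric codes behind the value labels before choosing a base level
    label list

    * Original specification: the 1.frame coefficient is C vs neutral at the base alignment
    regress Qa ib3.frame##i.alignment ib2.oper i.order i.country2, cluster(ResponseId)

    * Same model with opposite-party source as the alignment base (ASSUMED to be coded 2)
    regress Qa ib3.frame##ib2.alignment ib2.oper i.order i.country2, cluster(ResponseId)
    ```

    Both fits describe the same model and give the same predictions; only the meaning of each coefficient changes. Under the assumed coding, the 1.frame coefficient in the second fit should equal the sum of the C coefficient and the C#Opposite party source interaction from the first.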

    What, then, can be said?

    The coefficient of C in the model is, to 2 places, -.45. This means that conditional on alignment taking on its reference value (Independent) then the expected value of Qa will be .45 less than the expected value of Qa in an otherwise identical subject whose frame value is neutral. A slightly less verbose way to say that is that the marginal effect of frame C compared to neutral, conditional on independent alignment, is -.45. That's the vocabulary to use: it is unambiguous, and it is definitely not ordinary language.

    Now, what could we say about an otherwise identical subject (still with frame C) whose alignment is opposite party? Comparing that subject to another one with frame neutral and opposite party, the difference will be -0.45 + 0.23 (the latter term being the coefficient of the frame C # opposite alignment interaction term, to 2 places). So that is -.22. Comparing -.22 to -.45 it is, algebraically, a larger number, but it is smaller in magnitude. So both of these conditional marginal effects of C are in the same direction (negative sign), and that of C conditional on independent has larger magnitude.
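
    The same sum can be obtained with a standard error and confidence interval via lincom, run right after the interaction regression. A hedged sketch: the level codes used here (frame 1 = C, alignment 2 = opposite party) are assumptions about how the variables are coded.

    ```stata
    * Replay the regression with coefficient names, to confirm how each term is labelled
    regress, coeflegend

    * C vs neutral, conditional on opposite-party source
    * (ASSUMED coding: frame 1 = C, base 3 = neutral; alignment 2 = opposite party)
    lincom 1.frame + 1.frame#2.alignment
    ```

    If the coding assumption holds, the point estimate should match the hand calculation of about -.22.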

    I would characterize trying to state that in ordinary language, without leaving different listeners/readers with different impressions, as a fool's errand. Just calculate and show the relevant marginal effect estimates themselves.

    Your instinct to go to the -margins- command was a good one, but unfortunately, just looking at the predictive margins, as you did, does not shed light directly on the discussion you and your colleagues were having, because to see effects, you need to then do pairwise subtractions among the 9 combinations of frame and alignment. More helpful would have been to inspect the output of:

    Code:
    margins alignment, dydx(frame)
    That would have given you directly the marginal effects of both the C and N frame values, each conditional on the three values of alignment. And that would spare you the task of figuring out which coefficients to add and subtract to get each of those marginal effects, and it would also do the calculations (including standard errors!) for you. That output, when viewed by all of you, will be unambiguous. If anyone is left confused, following it with -marginsplot- to graph the marginal effects would resolve that difficulty. No doubt, however, if you attempt to then translate the results into ordinary language, you will reach disagreement again. So don't do that! Ordinary language is simply not up to the task of interpreting these kinds of statistics.
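
    The suggested sequence might look like the following sketch, run after the interaction regression. The yline(0) reference line is an illustrative addition, not something prescribed in this thread.

    ```stata
    * Marginal effects of each non-base frame (vs neutral), conditional on alignment
    margins alignment, dydx(frame)

    * Graph those marginal effects with their confidence intervals;
    * yline(0) is a standard twoway option that adds a reference line at zero
    marginsplot, yline(0)
    ```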



    • #3
      Thanks a lot for taking the time Clyde,

      To clarify, my original post actually didn’t refer to any effect of C observed in the interaction model as the main effect. I reported that the main effect was separately found to be negative, and that in the interaction model the individual (conditional) term for C is negative too.

      On ordinary language, you’re absolutely right. Unfortunately we pre-registered the hypotheses in that form, as is common in our field.

      The way I actually present my results to my colleagues was a marginsplot graph indeed. Here it is.
      [Attached image: marginsplot screenshot, "Screenshot 2026-02-15 at 08.16.28.png"]


      It appears to me that what was a difference between sources in the control (neutral frame) condition disappears under C treatment because the (separately hypothesized and found) negative effect of C treatment is weaker under opposite-party condition.

      I’ll also reproduce the marginal effects below:

      Code:
      margins alignment, dydx(frame)
      HTML Code:
      ----------------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   std. err.      t    P>|t|     [95% conf. interval]
      -----------------------+----------------------------------------------------------------
      1.frame                |
                   alignment |
         Independent source  |   -.450276   .0725978    -6.20   0.000    -.5926267   -.3079253
         Co-partisan source  |  -.4581459   .0721426    -6.35   0.000    -.5996041   -.3166877
      Opposite party source  |  -.2226299   .0741549    -3.00   0.003    -.3680339   -.0772258
      -----------------------+----------------------------------------------------------------
      2.frame                |
                   alignment |
         Independent source  |  -.2098803   .0717897    -2.92   0.003    -.3506466    -.069114
         Co-partisan source  |  -.2896379   .0712597    -4.06   0.000    -.4293649   -.1499108
      Opposite party source  |  -.1004163   .0694494    -1.45   0.148    -.2365937     .035761
      -----------------------+----------------------------------------------------------------
      3.frame                |  (base outcome)
      ----------------------------------------------------------------------------------------
      Note: dy/dx for factor levels is the discrete change from the base level.
      Here I observe that for frame 1 (which is C) the discrete change from the base level (-.22) is smaller in magnitude under opposite party source, compared to the other source conditions (-.45). The signs are also in the expected direction, considering that the unconditional effect of C on Qa was separately hypothesized and found to be negative. Unless we want to completely prohibit statistical tests of natural-language hypotheses, I think this warrants a rejection of the null for H3b: "C frames' effect on Qa attitudes will be the weakest when it is supplied by the opposite elites." Would you disagree?



      • #4
        And could you at least confirm that among the 3 objections I described, Team member 1 is incorrect? That could give us something to work on.



        • #5
          The term "main effect" needs to vanish, as it enables misconceptions. I always taught my students to regard slopes as the measure of effect, and taught them a smidgen of calculus to enable calculation of slopes at various interesting points in the covariate space. Now, of course, -margins- makes this easy. For a humorous view in this direction, albeit in the context of nonlinear models and continuous covariates, see this article by an old friend of mine:

          Roncek, D.W., 1993. When Will They Ever Learn that First Derivatives Identify the Effects of Continuous Independent Variables or “Officer, You Can't Give Me a Ticket, I Wasn't Speeding for an Entire Hour”. Social Forces, 71(4), pp.1067-1078.

          The point is unnecessarily framed too narrowly in terms of continuous covariates, as the same kinds of principles apply in the context of conventional "ANOVA" approaches.



          • #6
            Re #3.

            Well, I agree that the estimated marginal effect of frame C (=1) is of smaller magnitude under opposite alignment than under co-partisan or independent alignment. But if we look at the confidence intervals, it seems that they overlap a bit. So, if you want to know if that effect is smaller in the sense of a statistically significant difference, we have to go one step farther and run
            Code:
            margins alignment, dydx(frame) pwcompare
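
            A possible variant of the command above, sketched here as an option rather than a recommendation: the effects suboption adds test statistics and p-values to the pairwise table, and mcompare() requests an adjustment for multiple comparisons. Whether an adjustment is appropriate for pre-registered contrasts is a judgment call this sketch does not settle.

            ```stata
            * Pairwise differences in the marginal effects, with test statistics and p-values
            margins alignment, dydx(frame) pwcompare(effects)

            * Optional: the same comparisons with a Bonferroni adjustment
            margins alignment, dydx(frame) pwcompare(effects) mcompare(bonferroni)
            ```
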
            As for whether I agree that Team member 1 is wrong, I can't really say. It's phrased in ordinary language, and there is no way to know what he or she means by the word "strengthening." It's meaningless. Or, to steal from Pauli, "it's not even wrong." But seriously, what team member 1 is saying is neither right nor wrong. It all hinges on the intended meaning of the ambiguous (in this context) word "strengthening," which could be either in agreement or disagreement with you depending on what each of you thinks it means here.

            Unless we want to completely prohibit statistical tests of natural-language hypotheses,...
            If I ruled the world, which, fortunately for the world, I do not, I would indeed prohibit this.

            On ordinary language, you’re absolutely right. Unfortunately we pre-registered the hypotheses in that form, as is common in our field.
            Well, since the ordinary language is susceptible of many mutually contradictory interpretations, you can report it any way you like and you won't even be wrong (thanks again, Wolfgang Pauli). In all seriousness, I would not attempt to report an ordinary language interpretation of these results. I would show the actual statistical estimates, and, if you are doing hypothesis tests I would show those as well. And to the extent I put them in words, I would restrict the description of comparisons to purely mathematical terminology and avoid words like "strengthen" or "weaken."



            • #7
              Re#6

              Here is the significance test result, showing that confidence intervals for opposite vs independent (control) do not include zero, as I understand.
              HTML Code:
              ----------------------------------------------------------------------------------------------
                                                           |   Contrast Delta-method         Unadjusted
                                                           |      dy/dx   std. err.     [95% conf. interval]
              ---------------------------------------------+------------------------------------------------
              1.frame                                      |
                                                 alignment |
                 Co-partisan source vs Independent source  |  -.0078699   .1034016     -.2106212    .1948813
              Opposite party source vs Independent source  |   .2276461   .1024118      .0268355    .4284566
              Opposite party source vs Co-partisan source  |    .235516   .1036335        .03231     .438722
              And thank you very much for your engagement. I completely agree that things would have been better if we preregistered our hypotheses in mathematical notation for precision. But I am quite perplexed by your overall stance, which seems to imply that ordinary language can become math, but we can never travel in the other direction. This would make science communication impossible then, wouldn’t it?

              We conduct an experiment in the physical world, using materials entirely made of ordinary language, to test concepts that originated in ordinary language, we are allowed to translate the measurements into mathematical quantities to generate data input for statistical tests, but then we would be prohibited from translating these quantities back to ordinary language for the purpose of reporting the results? If this was a study of some health indicator, with medications instead of frames, and diets instead of source conditions, would we prohibit the researchers from concluding that the “effect of the tested medication is stronger under one particular diet compared to the other,” because stronger can mean virtually anything?

              But natural language does have constraints of its own, doesn’t it? Constraints grounded in convention, and the context of the domain-specific scientific literature it has been so far inserted into. “Strengthening” does have a finite range of possible meanings in this context, “effect” too—for example Mike Lacey here takes it as the slopes, so did I.

              I am also perplexed by the insistence that the truth value of various interpretative statements about our experiment cannot be distinguished from each other at all. Can you imagine any good-faith interpretation of the hypotheses above that would render them systematically testable and falsifiable and, in light of the data reported above, result in Team Member 1 being correct in saying that “opposite-party source strengthens C effect”? But we can imagine, without stretching, how they could be incorrect, can’t we?

              I promise that I don’t intend to torture you with more posts until I get the answer I want, Sir. I am just finding it difficult to take yours as a consistently held stance, rather than perhaps a warning exaggerated for pedagogical purposes, which I do appreciate.



              • #8
                Just adding here C's unconditional effect on Qa, without interactions, as a reminder of the context of the discussion, lest it be buried under my walls of text.
                Code:
                reg Qa ib3.frame ib2.oper i.order i.country2, cluster( ResponseId )
                HTML Code:
                             |               Robust
                          Qa | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                -------------+----------------------------------------------------------------
                       frame |
                          C  |  -.3631873   .0385492    -9.42   0.000    -.4387688   -.2876058
                          N  |  -.1823204   .0369146    -4.94   0.000     -.254697   -.1099438
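
                For comparison, an averaged effect of each frame can also be computed from the interaction model itself. A minimal sketch: it refits the interaction regression from #1 and averages the frame effects over the observed distribution of alignment. The result need not coincide exactly with the coefficients in the table above, since the two models differ.

                ```stata
                * Refit the interaction model from #1 (output suppressed)
                quietly regress Qa ib3.frame##i.alignment ib2.oper i.order i.country2, cluster(ResponseId)

                * Effect of each frame vs neutral, averaged over the sample's alignment values
                margins, dydx(frame)
                ```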



                • #9
                  I haven't read through this thread carefully, but based on the title of the thread this handout may be useful:

                  https://academicweb.nd.edu/~rwilliam/stats2/L53.pdf

                  The key idea is that, once you toss in interaction terms, the interpretation of the main effects totally changes.
                  -------------------------------------------
                  Richard Williams
                  Professor Emeritus of Sociology
                  University of Notre Dame
                  StataNow Version: 19.5 MP (2 processor)

                  WWW: https://academicweb.nd.edu/~rwilliam/



                  • #10
                    Re #7

                    So, the tests for Opposite vs Independent and Opposite vs Co-Partisan alignment both have confidence intervals that exclude zero (although just barely). So the conclusion that the magnitude of the marginal effect of frame C is smallest in the presence of opposite alignment seems good.

                    Regarding your hypothetical medication study, I would object to the use of "stronger" just as much there as I do here. And I think that requiring the use of mathematical terminology, which is unambiguous, instead of ordinary, ambiguous, language enhances communication. And in my experience, that is what people usually do. I can recall lots of studies with conclusions like "Treatment X resulted in recovery n days sooner than treatment Y" or "Treatment X resulted in z mmHg greater reduction in blood pressure than treatment Y." You might get by with "stronger" in a context where none of the effects is itself negative. But once you use "stronger" and "weaker" with negative effects, confusion reigns, and different people understand the same sentence in opposite ways.

                    In your specific case, if you want to move a little bit in the direction of ordinary language, why not say that frame C reduces Qa by a smaller amount in the presence of opposite alignment than it does in independent or co-partisan alignment?



                    • #11
                      I think we can work with "marginal effect of frame C is smallest in the presence of opposite alignment" and "C reduces Qa by a smaller amount in the presence of opposite alignment than it does in independent or co-partisan alignment." Thank you very much, this was helpful!
