
  • Confusion about interaction in GLM (gamma family, log link)

    Hello,

    I’m having a problem with contradictory results between the GLM output and a post-estimation Wald test concerning an interaction between a 4-level categorical variable and a dichotomous variable.
    I’m using Stata 12.1 for Windows.

    I ran a glm as follows:

    Code:
    glm v_totalG ib0.LOC##i.banPUBL pe_drink fredrink gmcperCap if gender==1, fam(gamma) link(log) vce(cluster COUNTRY) eform
    My output from this code is:



    [Screenshot of Stata output: Stata_Output_BG.png]



    In the output, no single interaction term is significant. Even if I change the reference category, this remains the same.

    However, for some reason, the post-estimation Wald test says that the overall interaction effect is significant, so at least one of the parameters being tested is not equal to zero.
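
    (The exact post-estimation command is not shown here; a joint Wald test of the interaction terms after the glm above would typically look something like the sketch below.)

    Code:
    * joint Wald test that all LOC#banPUBL interaction coefficients are zero
    testparm i.LOC#i.banPUBL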

    Does anyone have an idea about the reason for these contradictory results?


    Thank you in advance for your help!

    Bettina

  • #2
    In the output, no single interaction term is significant. Even if I change the reference category, this remains the same.
    That is right. Changing the reference category is just a re-parameterization of the model, an algebraic reshuffle of the design matrix: no inferences based on the model will change.
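
    You can verify this directly (a sketch; the alternative base level below is just illustrative): refit the model with a different base category and rerun the joint test. The individual contrasts that are reported change, but the fit, the predictions, and the joint Wald test are identical.

    Code:
    * same model with a different base level for LOC; only the reported contrasts change
    glm v_totalG ib1.LOC##i.banPUBL pe_drink fredrink gmcperCap if gender==1, fam(gamma) link(log) vce(cluster COUNTRY) eform
    * joint test of the interaction is unchanged by the re-parameterization
    testparm i.LOC#i.banPUBL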

    However, for some reason, the post-estimation Wald test says that the overall interaction effect is significant, so at least one of the parameters being tested is not equal to zero.

    Does anyone have an idea about the reason for these contradictory results?
    These results are not contradictory. You cannot infer that one of the variables must be "significant" from the fact that the joint null hypothesis is rejected. The appearance of a contradiction arises from a misunderstanding of what a significance test tells you. If you think of a significance test as distinguishing between "something is happening with this (these) variable(s)" and "nothing is happening with this (these) variable(s)", then it appears contradictory. But that is not what significance tests mean and that is not what p-values do.

    What the test of a single variable, when you declare the results significant, tells you is this: the magnitude of the point estimate generated from your data is sufficiently large that, given the level of (im)precision in the estimate that arises from sampling variation in the data, fewer than 5% of random samples would generate a magnitude this large if the true value of the parameter is zero.

    When you do a joint test on several variables, the message is analogous: the magnitudes of the point estimates generated from your data are, jointly, sufficiently large that, given the level of (im)precision in the estimates that arises from sampling variation in the data, fewer than 5% of random samples would generate a group of magnitudes this large if the true values of all of these parameters were zero.

    When viewed this way, you can see that the joint-significance test does not tell you anything directly about the value of any one of the parameters. It is entirely possible that more than 5% of samples will generate individual values greater in magnitude than those you obtained if those individual parameters are (separately) zero, while fewer than 5% will generate an ensemble of values that is as far from zero as this set of estimates.

    Mathematically, the metric comparing a single value to zero is just |estimated value|. But for a joint test of several variables, the metric of joint distance from zero is more complicated and is like a weighted sum of squares and cross products of the estimates. (This is literally true for F-tests, but the idea is similar for other tests.) So again, the joint metric of distance from zero can be large enough to trigger "significance" even though none of the individual estimates is. Geometrically, a single-variable significance test is equivalent to asking whether the confidence interval excludes zero. But a multi-variable significance test is equivalent to asking whether the origin lies inside an oblique confidence ellipsoid.
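
    A small simulation makes the point concrete (a sketch with made-up variable names, using an ordinary linear regression rather than a gamma GLM for brevity): two highly correlated predictors can each look non-significant on their own while the joint test of both rejects decisively.

    Code:
    * simulate two nearly collinear predictors that both affect the outcome
    clear
    set seed 12345
    set obs 200
    gen x1 = rnormal()
    gen x2 = x1 + 0.2*rnormal()       // x2 is nearly collinear with x1
    gen y  = x1 + x2 + rnormal(0, 4)  // outcome truly depends on both
    regress y x1 x2                   // individual t-tests are usually non-significant
    testparm x1 x2                    // joint F-test typically rejects strongly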

    While this type of situation can arise with any joint test of significance of several parameter estimates, it is particularly common when the parameters concerned are all coefficients of a group of indicator (dummy) variables for a single construct, or a group of interaction variables.



    • #3
      Dear Clyde,

      Thank you very much for your quick and detailed answer!

      Bettina
