  • Extremely high IRR by including interaction term

    I get an extremely large IRR (5225582) for the immediate vaccine impact (vaxera) when I include an interaction term in my model, even though the interaction is significant. Does this mean the interaction term is a problem?
    My data and code are as follows:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int yoa float midyrpop str3 spn_serotype byte n str12 type float(time vaxera postslope) byte(_est_a _est_b)
    1999    33620 "19F" 0 "vaccine-type"  1 0 0 1 1
    2000    34505 "19F" 0 "vaccine-type"  2 0 0 1 1
    2001    35411 "19F" 2 "vaccine-type"  3 0 0 1 1
    2002    36340 "19F" 3 "vaccine-type"  4 0 0 1 1
    2003    39747 "19F" 0 "vaccine-type"  5 0 0 1 1
    2004    41985 "19F" 2 "vaccine-type"  6 0 0 1 1
    2005    42738 "19F" 2 "vaccine-type"  7 0 0 1 1
    2006    43916 "19F" 0 "vaccine-type"  8 0 0 1 1
    2007    44536 "19F" 1 "vaccine-type"  9 0 0 1 1
    2008    44820 "19F" 1 "vaccine-type" 10 0 0 1 1
    2009    46343 "19F" 0 "vaccine-type" 11 0 0 1 1
    2010    47714 "19F" 3 "vaccine-type" 12 0 0 1 1
    2012 40730.05 "19F" 2 "vaccine-type" 13 1 1 1 1
    2013  43214.2 "19F" 0 "vaccine-type" 14 1 2 1 1
    2014    47807 "19F" 0 "vaccine-type" 15 1 3 1 1
    2015    47921 "19F" 0 "vaccine-type" 16 1 4 1 1
    2016 42048.97 "19F" 0 "vaccine-type" 17 1 5 1 1
    2017 21958.98 "19F" 0 "vaccine-type" 18 1 6 1 1
    2018    46210 "19F" 0 "vaccine-type" 19 1 7 1 1
    2019    45972 "19F" 0 "vaccine-type" 20 1 8 1 1
    end
    Interaction model:
    glm n vaxera time postslope , link(log) family(nbinomial) vce(hac nwest 2) exp(midyrpop) eform

    postslope denotes the interaction term and vaxera denotes the immediate vaccine impact. The IRR I am worried about when I include the interaction is the one corresponding to vaxera.

    **Model without interaction
    glm n vaxera time , link(log) family(nbinomial) vce(hac nwest 2) exp(midyrpop) eform

  • #2
    Your variable postslope is not an interaction term. An interaction term is the product of two variables. It appears to be similar to a vaxera#time interaction term, but is not exactly that.
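    To make the distinction concrete, here is a minimal sketch of how postslope relates to a literal vaxera#time product. It assumes the cut point of 12 that is visible in the example data (time 13 is the first post-vaccine year); the variable name vaxera_x_time is made up for illustration.

```stata
* Rebuild the time structure of the example data (20 yearly observations)
clear
set obs 20
gen time = _n
gen byte vaxera = time > 12

* postslope as it appears in the example: years elapsed since time 12
gen postslope = vaxera * (time - 12)

* a literal product of the two variables, for comparison
gen vaxera_x_time = vaxera * time

list time vaxera postslope vaxera_x_time in 11/15, noobs
```

    Since postslope equals vaxera_x_time minus 12 times vaxera, the two versions should give the same fitted values; what changes is the meaning of the vaxera coefficient (the jump at time 12 in one case, the extrapolated jump at time 0 in the other).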

    Be that as it may, we can treat it for discussion purposes as if it really were an interaction term, because it functions in a similar way. The important thing to understand about interaction models is that the coefficient of a constituent of the interaction does not represent the effect of that constituent. It can only be interpreted in conjunction with the interaction coefficient itself. If you look at the full output, your gargantuan IRR for vaxera is accompanied by an infinitesimally small IRR for postslope, and the product of the two is a number of moderate magnitude. That is, they more or less cancel each other out. Notice also that the constant term is extremely small, as well. So your model is trying to fit small numbers by putting together extremely large positive and extremely large negative coefficients. Sometimes things like this can be dealt with by centering the variables in the interaction--but in this case it doesn't help (I tried it on your example data).
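    As a rough sketch of what "interpreting the two together" looks like in practice: the combined post-vaccine effect k years after time 12 is IRR(vaxera) multiplied by IRR(postslope) raised to the power k, which -lincom- can compute on the log scale. The code below re-enters only the variables the model needs from the example data; the -tsset time- line is my assumption about how the data were declared, since vce(hac ...) needs time-series data.

```stata
clear
input float midyrpop byte n float(time vaxera postslope)
   33620 0  1 0 0
   34505 0  2 0 0
   35411 2  3 0 0
   36340 3  4 0 0
   39747 0  5 0 0
   41985 2  6 0 0
   42738 2  7 0 0
   43916 0  8 0 0
   44536 1  9 0 0
   44820 1 10 0 0
   46343 0 11 0 0
   47714 3 12 0 0
40730.05 2 13 1 1
 43214.2 0 14 1 2
   47807 0 15 1 3
   47921 0 16 1 4
42048.97 0 17 1 5
21958.98 0 18 1 6
   46210 0 19 1 7
   45972 0 20 1 8
end

* assumption: data declared as a single time series indexed by time
tsset time

* the model from #1
glm n vaxera time postslope, link(log) family(nbinomial) ///
    vce(hac nwest 2) exposure(midyrpop) eform

* combined effect 1 year after introduction: IRR_vaxera * IRR_postslope
lincom vaxera + postslope, eform

* combined effect 8 years after introduction: IRR_vaxera * IRR_postslope^8
lincom vaxera + 8*postslope, eform
```

    Given how unstable this fit is, the numbers from -lincom- should be read as an illustration of the arithmetic, not as trustworthy estimates.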

    By the way, I also note that if you get rid of the fancy Newey-West VCE, the model simply fails to converge outright. Not surprising given the data.

    This model is being stretched beyond its useful limits. I think you need to consider some different way to model this data. One possibility is to just aggregate the pre- and post-vaxera data and do a simple chi-square or Fisher's exact test of the proportion n/midyrpop (with midyrpop rounded to an integer), or something like that. Another is a linear probability model of n/midyrpop. Good luck--this is a difficult problem to model: rare events to start with, going down to zero!
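    One way the aggregation idea could be sketched, using only n, midyrpop and vaxera from the example data. The variable names pop and noncases are made up for illustration, and -csi- with the exact option is just one of several commands that would do the job (-tabi- is another).

```stata
clear
input float midyrpop byte n byte vaxera
   33620 0 0
   34505 0 0
   35411 2 0
   36340 3 0
   39747 0 0
   41985 2 0
   42738 2 0
   43916 0 0
   44536 1 0
   44820 1 0
   46343 0 0
   47714 3 0
40730.05 2 1
 43214.2 0 1
   47807 0 1
   47921 0 1
42048.97 0 1
21958.98 0 1
   46210 0 1
   45972 0 1
end

* total cases and total population, pre (vaxera==0) vs post (vaxera==1)
collapse (sum) n midyrpop, by(vaxera)

* round the population to an integer so it can serve as a count
gen long pop = round(midyrpop)
gen long noncases = pop - n
list, noobs

* csi takes: cases post, cases pre, non-cases post, non-cases pre
csi `=n[2]' `=n[1]' `=noncases[2]' `=noncases[1]', exact
```

    Note that this treats each year's population as independent people, which is a simplification, and it only compares overall levels before and after; it says nothing about a change in slope.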



    • #3
      Thanks Clyde. Actually, I am centering the interaction at the time immediately before the start of the intervention, i.e. (13-12, 14-12, etc.). I was hoping I could keep the interaction, since my main interest is to check for a slope change following vaccine introduction (possible serotype replacement). My model is giving me a headache, since segmented regression does not even seem to be a good fit, and I just wonder if there is another way to assess slope change.



      • #4
        My model is giving me a headache, since segmented regression does not even seem to be a good fit, and I just wonder if there is another way to assess slope change
        Looking at the outcomes following the intervention, it is clear that the slope after intervention is indistinguishable from zero. You won't be able to fit that with maximum likelihood in any log-linked model because it requires estimating a slope that is infinitely negative. It will either fail to converge or, if you torture it into convergence somehow, it will give you useless results. You've already seen that.

        Personally, I think that the post-intervention results are so blatantly obvious that it is pointless to try any kind of fancy modeling. Nevertheless, if you must, I think your best chances with this, though I wouldn't call them good chances, would be to use a linear probability model, with n/midyrpop as the outcome variable, estimated with -mixed-. Or maybe you could get something out of your -nbreg- approach if you did it Bayesian with an informative prior that bounds the coefficients away from negative infinity. But given the strongly, clearly zero outcomes in the post-intervention years, that means adopting a prior that actually flies in the face of the evidence--not something I'd be at all comfortable with.
