  • Computing p-values for interaction for third variables that are categorical?

    Hi! Thanks in advance for any help!

    I am working with a dataset in Stata 12.1 and am trying to get a p-value for interaction.
    I have a dichotomous outcome variable (0 or 1), a dichotomous predictor variable (0 or 1), and around 10-15 third variables that I want to add one at a time to the model to see which ones interact with the predictor. I can do this for third variables that are also dichotomous, but not for categorical variables with more than two possible values.

    For example, for a third variable that is dichotomous I have written the code:

    Code:
    glm ppd10new hiv##etoh if hiv<10 & case<3 & etoh<3, fam(poisson) link(log) nolog robust vce(cluster idno) eform
    My output from this code is:
    Code:
    Generalized linear models                          No. of obs      =      1923
    Optimization     : ML                              Residual df     =      1919
                                                       Scale parameter =         1
    Deviance         =  948.6268593                    (1/df) Deviance =   .494334
    Pearson          =          578                    (1/df) Pearson  =  .3011985
    
    Variance function: V(u) = u                        [Poisson]
    Link function    : g(u) = ln(u)                    [Log]
    
                                                       AIC             =  1.896322
    Log pseudolikelihood =  -1819.31343                BIC             = -13562.16
    
                                     (Std. Err. adjusted for 496 clusters in idno)
    ------------------------------------------------------------------------------
                 |               Robust
        ppd10new |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               1.hiv |   .8793929   .0435436    -2.60   0.009     .7980595    .9690154
          1.etoh |   1.200831   .0563385     3.90   0.000     1.095334    1.316488
                 |
            hiv#etoh |
            1 1  |   1.091333   .0753802     1.27   0.206     .9531548    1.249543
                 |
           _cons |   .7176603   .0213292   -11.16   0.000     .6770501    .7607063
    ------------------------------------------------------------------------------
    From this, I can conclude (if I am interpreting correctly) that my p-value for interaction is 0.206.

    When I do something similar for a third variable that is categorical with more than two possible values (age group in this case), I write very similar code:

    Code:
    glm ppd10new hiv##agegroupnew if hiv<10 & case<3 & agegroupnew<8, fam(poisson) link(log) nolog robust vce(cluster idno) eform
    And I get the following output:

    Code:
    Generalized linear models                          No. of obs      =      1933
    Optimization     : ML                              Residual df     =      1921
                                                       Scale parameter =         1
    Deviance         =  947.6434599                    (1/df) Deviance =  .4933074
    Pearson          =          581                    (1/df) Pearson  =  .3024466
    
    Variance function: V(u) = u                        [Poisson]
    Link function    : g(u) = ln(u)                    [Log]
    
                                                       AIC             =  1.901523
    Log pseudolikelihood =  -1825.82173                BIC             = -13588.23
    
                                            (Std. Err. adjusted for 497 clusters in idno)
    -------------------------------------------------------------------------------------
                        |               Robust
               ppd10new |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------------+----------------------------------------------------------------
                      1.hiv |   .8406898   .0789281    -1.85   0.065     .6993922    1.010534
                        |
            agegroupnew |
                     2  |   1.093663   .0672336     1.46   0.145     .9695167    1.233706
                     3  |   1.266441   .0693771     4.31   0.000      1.13751    1.409986
                     4  |   1.222377   .0881176     2.79   0.005     1.061315     1.40788
                     5  |   1.188677     .09617     2.14   0.033     1.014372    1.392934
                     6  |   1.277771   .1003009     3.12   0.002      1.09556    1.490285
                        |
            hiv#agegroupnew |
                   1 2  |   1.064964   .1061675     0.63   0.528     .8759466    1.294769
                   1 3  |   1.025141   .1111335     0.23   0.819     .8289086    1.267829
                   1 4  |   1.101809   .1330067     0.80   0.422     .8696652     1.39592
                   1 5  |   1.190728   .1693237     1.23   0.220     .9010943    1.573458
                   1 6  |   1.264852   .1613439     1.84   0.065     .9850562    1.624122
                        |
                  _cons |   .6494024   .0321789    -8.71   0.000     .5892987    .7156361
    -------------------------------------------------------------------------------------

    I only want one p-value showing whether there is interaction between the three variables. Instead here I have a p-value for every level of the potential interaction variable (every age group level in this case).

    Is there a way to get a single p-value showing whether there is interaction between my dichotomous predictor variable and a third categorical variable with many levels/categories, in a model for my dichotomous outcome, similar to what I see in the first example above?



    Thanks so much for any help you can provide!! Have a great day!

    Leo


  • #2
    I think this will do what you want if you run it after your regression:

    Code:
    levelsof agegroupnew if e(sample), local(levels)
    testparm 1.hiv#i(`levels').agegroupnew
    This will provide a joint test of the omnibus null hypothesis that all of the interaction effects of hiv with the agegroupnew variable are zero, so it generalizes the test of a single interaction term in the dichotomous case.
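
    As a possible shortcut (a sketch, assuming the model was fit with hiv##agegroupnew as in the question), testparm also accepts factor-variable notation directly, so the levelsof step can be skipped. Omitted base-level terms are simply left out of the test:

    Code:
    * Joint Wald test of all hiv-by-agegroupnew interaction terms
    testparm i.hiv#i.agegroupnew
    With the six age groups shown in the output above, this should give one chi-squared statistic with 5 degrees of freedom (one per non-base age group) and a single p-value.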

    If you are iterating some loop over variables adding them one at a time, just replace agegroupnew by the iterating variable in your loop.
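
    A minimal sketch of such a loop, meant to be run from a do-file. The local macro name thirdvars is hypothetical, and the list should be extended with your own variables. It also assumes that invalid codes of each third variable have already been set to missing, so the variable-specific restrictions from the question (etoh<3, agegroupnew<8) are not repeated:

    Code:
    * Candidate third variables (extend this list as needed)
    local thirdvars etoh agegroupnew

    foreach v of local thirdvars {
        * Fit the model with the full hiv-by-`v' interaction
        quietly glm ppd10new i.hiv##i.`v' if hiv<10 & case<3, ///
            family(poisson) link(log) vce(cluster idno) eform
        * Joint Wald test that all interaction terms are zero
        quietly testparm i.hiv#i.`v'
        * testparm leaves the test results in r(chi2), r(df) and r(p)
        display "`v': chi2(" r(df) ") = " %6.2f r(chi2) "   p = " %6.4f r(p)
    }
    This prints one line per third variable, which makes it easy to scan for the ones with evidence of interaction.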



    • #3
      Thanks so much Clyde! That worked perfectly!
