  • Using Wald test to determine whether categorical x categorical interaction needed in binary logit with cluster robust VCE

    Hello,

    This question is related to another question I found on this forum.

    Like the author of that question, I am testing two binary logistic regression models, both of which have vce(cluster) specified to account for intraclass clustering at the classroom level. In my case, I am trying to determine whether I should include a categorical x categorical interaction of the two categorical independent variables in my model. One of my independent variables (Phase) has three levels (0, 1, 2); the other (Semester) has two levels (0, 1).

    I can't use a likelihood-ratio chi2 test (-lrtest-) to test the significance of the interaction because of the clustering, so I am trying to follow the advice there, which says to use a Wald test instead. However, I'm not sure whether I have run and interpreted the Wald test correctly since, if I'm not mistaken, my code is not really testing the overall significance of the interaction in the model but only the difference relative to the base (reference) level.

    Code:
    . logit DFW  i.PHASE_  i.SEMESTER_, vce(cluster STRM_SECT) allbase nolog or
    
    . estimates store a
    
    . logit DFW  i.PHASE_  i.SEMESTER_  PHASE_#SEMESTER_, vce(cluster STRM_SECT) allbase nolog or
    
    . test a
    a not found
    r(111);
    
    . test 1.PHASE_#1.SEMESTER_ 2.PHASE_#1.SEMESTER_
    
     ( 1)  [DFW]1.PHASE_#1.SEMESTER_ = 0
     ( 2)  [DFW]2.PHASE_#1.SEMESTER_ = 0
    
               chi2(  2) =    0.41
             Prob > chi2 =    0.8165
    I see that the Wald chi2 for the model with the interaction increased (Wald chi2[5] = 49.58 versus Wald chi2[3] = 20.15). The pseudo R2 also increased, although it's still quite low: 0.0174 versus 0.0172. The coefficient on Phase 3 is no longer significant.

    After reading another forum post (I seem to have lost the link, sorry!), I also explored the AIC and BIC (n=2412). I don't know how applicable they are in this instance, but I saw that the model without the interaction has slightly lower AIC and BIC values.

    Code:
    . * clear past estimates
    . est clear
    
    . * Model 0: Intercept only
    . quietly logit DFW, vce(cluster STRM_SECT) or
    . est store M0
    
    . * Model 1: PHASE added
    . quietly logit DFW i.PHASE_, vce(cluster STRM_SECT) or
    . est store M1
    
    . * Model 2: PHASE + SEMESTER
    . quietly logit DFW i.PHASE_ i.SEMESTER_, vce(cluster STRM_SECT) or
    . est store M2
    
    . * Model 3: PHASE + SEMESTER + PHASE#SEMESTER
    . quietly logit DFW i.PHASE_ i.SEMESTER_ i.PHASE_#i.SEMESTER_, vce(cluster STRM_SECT) or
    . est store M3
    
    .  * Table of results
    .  est table M0 M1 M2 M3, stats(chi2 df N aic bic rank) star(.05 .01 .001) eform varwidth(24) style(nolines)
    
    ------------------------------------------------------------------------------------------
                    Variable        M0              M1              M2              M3        
    ------------------------------------------------------------------------------------------
                      PHASE_  
                    Phase 2                     .79809394       .85430332       .92232566    
                    Phase 3                      .4898581**     .47872228***    .49277439    
                              
                   SEMESTER_  
                     Spring                                     1.6415463***      1.83125**  
                              
            PHASE_#SEMESTER_  
             Phase 2#Spring                                                     .82446964    
                              
            PHASE_#SEMESTER_  
             Phase 3#Spring                                                     .93510388    
                              
                       _cons    .15351506***    .20309051***    .16518016***    .15699659***  
    ------------------------------------------------------------------------------------------
                        chi2                     8.546371       20.150958       49.579609    
                          df                                                                  
                           N         2412            2412            2412            2412    
                         aic    1894.0142       1880.7794       1867.5571       1871.0909    
                         bic    1899.8024        1898.144         1890.71       1905.8202    
                        rank            1               3               4               6    
    ------------------------------------------------------------------------------------------
                                                         legend: * p<.05; ** p<.01; *** p<.001
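As a side note for anyone wanting to double-check the aic/bic column: these follow the usual formulas AIC = -2*ll + 2k and BIC = -2*ll + k*ln(N), where k is the model rank reported in the table. (Worth noting that with vce(cluster) these are pseudolikelihoods, so AIC/BIC comparisons are informal at best.) A quick off-Stata sanity check in Python, with the log pseudolikelihoods copied from the -logit- outputs posted later in this thread:

```python
import math

# Log pseudolikelihoods from the -logit- outputs, ranks from the table, N = 2412
n = 2412
models = {
    "M2": (-929.77857, 4),   # PHASE + SEMESTER
    "M3": (-929.54546, 6),   # + PHASE#SEMESTER
}
for name, (ll, k) in models.items():
    aic = -2 * ll + 2 * k
    bic = -2 * ll + k * math.log(n)
    print(f"{name}: AIC={aic:.4f}  BIC={bic:.4f}")
# M2: AIC=1867.5571  BIC=1890.7100  (table: 1867.5571, 1890.71)
# M3: AIC=1871.0909  BIC=1905.8202  (table: 1871.0909, 1905.8202)
```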
    All in all, would I be correct in concluding that I don't need the interaction in the model? Is there another way I should be using the Wald test? (If so, what is the correct syntax?)

    Thanks in advance for your help!
    Last edited by Lauren Hirsh; 08 Nov 2015, 11:49. Reason: added tags

  • #2
    However, I'm not sure whether I have run and interpreted the Wald test correctly since, if I'm not mistaken, my code is not really testing the overall significance of the interaction in the model but only the difference relative to the base (reference) level
    You have run the Wald test correctly. When you interact a 3-level variable with a 2-level variable, you get (3-1) × (2-1) = 2 interaction terms. Both of those interaction terms appear in your Wald test, so you have completely tested the interaction.
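As a cross-check on the arithmetic: with two constraints the Wald statistic is referred to a chi-squared distribution with 2 degrees of freedom, whose upper tail has the closed form exp(-x/2). A quick check in Python, using the rounded statistic from the output above:

```python
import math

n_terms = (3 - 1) * (2 - 1)     # number of interaction coefficients = 2
chi2_stat = 0.41                # Wald statistic from -test-, rounded to 2 dp

# For 2 df, P(chi2 > x) = exp(-x/2)
p = math.exp(-chi2_stat / 2)
print(n_terms, round(p, 4))     # 2 0.8146 (Stata's 0.8165 uses the unrounded statistic)
```

In Stata itself, -testparm i.PHASE_#i.SEMESTER_- after fitting the interaction model builds the same joint test automatically, without having to type out the coefficient names by hand.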

    As for whether that means you can drop the interaction from your model, that would depend on more than just a verdict on statistical significance. An interaction effect can be very real but fail to attain statistical significance in a particular analysis for a number of reasons, among them being if some combination of levels of the two variables is rare. If the theory in your field says it has to be there, then you should leave it in regardless. If your understanding of the science underlying your analysis is such that there is strong reason to believe there should be interaction, then you should leave it in the model. And, even in the absence of a good theoretical motivation, if the regression coefficients for the two interaction terms are substantively large (in particular, if they are a substantial fraction of, or even as big as or greater than the corresponding main effects), then you should strongly consider leaving the interaction terms in.

    I think the only circumstance in which statistical significance would be the sole determinant of whether to include an interaction term is when the purpose of the study was to determine whether an interaction exists, and then only if I were confident that the sample size was adequate to power such a test (including large numbers of observations for each combination of levels of those variables), that my outcome variable was measured with minimal noise, and that my study was in all respects well designed and implemented.

    • #3
      Many thanks for your helpful reply, Clyde. I hope it is alright if I ask a few follow-up questions to make sure I understand!

      An interaction effect can be very real but fail to attain statistical significance in a particular analysis for a number of reasons, among them being if some combination of levels of the two variables is rare.
      Hmm, I do wonder if this is the case here but am not certain. Is there a rule of thumb for determining if one or more of the combinations is rare enough to impact the significance? (A quick online search did not turn up any specific answers.)

      Below is a breakdown of the frequencies and percentages in each Semester/Phase combination.
      Code:
      .  tab3way PHASE_ DFW SEMESTER_, rowpct 
      
      Table entries are cell frequencies and row percentages
      Missing categories ignored
      
      --------------------------------------------------------
                |               Semester and DFW              
                | ------- Fall -------    ------ Spring ------
       CR Phase |  No (ABC)  Yes (DFW)     No (ABC)  Yes (DFW)
      ----------+---------------------------------------------
        Phase 1 |       293         46          160         46
                |     86.43      13.57        77.67      22.33
                | 
        Phase 2 |       808        117          247         54
                |     87.35      12.65        82.06      17.94
                | 
        Phase 3 |       349         27          234         31
                |     92.82       7.18        88.30      11.70
      --------------------------------------------------------
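There is no hard threshold, but one commonly cited rough heuristic for logistic regression is on the order of 10 outcome events per estimated parameter, and the number of DFW events in the sparsest cells is what the sparse-data concern turns on. The row percentages above can be reproduced from the cell frequencies, and the smallest event count pulled out, with a short Python sketch (counts copied from the tab3way table):

```python
# Cell frequencies (ABC, DFW) copied from the tab3way table above
cells = {
    ("Phase 1", "Fall"):   (293, 46),
    ("Phase 1", "Spring"): (160, 46),
    ("Phase 2", "Fall"):   (808, 117),
    ("Phase 2", "Spring"): (247, 54),
    ("Phase 3", "Fall"):   (349, 27),
    ("Phase 3", "Spring"): (234, 31),
}

for (phase, sem), (abc, dfw) in cells.items():
    print(f"{phase}, {sem}: {100 * dfw / (abc + dfw):.2f}% DFW")

# Smallest number of DFW events in any Phase-Semester cell
print(min(dfw for _, dfw in cells.values()))   # 27 (Phase 3, Fall)
```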
      If the theory in your field says it has to be there, then you should leave it in regardless. If your understanding of the science underlying your analysis is such that there is strong reason to believe there should be interaction, then you should leave it in the model.
      I'm unaware of a strong theoretical justification for leaving in the interaction--most of the course redesign assessment literature I have read just supports the inclusion of Semester as a factor--but this is something I will look into some more.

      And, even in the absence of a good theoretical motivation, if the regression coefficients for the two interaction terms are substantively large (in particular, if they are a substantial fraction of, or even as big as or greater than the corresponding main effects), then you should strongly consider leaving the interaction terms in.
      Regarding the coefficients, should I be looking at the corresponding main effects in the model without the interaction terms or the model with them? Either way, it looks like they are substantively large compared to the effect of Phase 2, but not compared to Phase 3 or to Spring. Looking at the results below, what would you advise?

      Code:
      . logit DFW i.PHASE_ i.SEMESTER_, vce(cluster STRM_SECT) nolog 
      
      Logistic regression                               Number of obs   =       2412
                                                        Wald chi2(3)    =      20.15
                                                        Prob > chi2     =     0.0002
      Log pseudolikelihood = -929.77857                 Pseudo R2       =     0.0172
      
                                   (Std. Err. adjusted for 33 clusters in STRM_SECT)
      ------------------------------------------------------------------------------
                   |               Robust
               DFW |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            PHASE_ |
          Phase 2  |   -.157469   .1494057    -1.05   0.292    -.4502988    .1353608
          Phase 3  |  -.7366346   .2041975    -3.61   0.000    -1.136854   -.3364149
                   |
         SEMESTER_ |
           Spring  |   .4956387   .1456953     3.40   0.001     .2100812    .7811962
             _cons |  -1.800719   .1341319   -13.42   0.000    -2.063612   -1.537825
      ------------------------------------------------------------------------------
      
      . logit DFW i.PHASE_ i.SEMESTER_ PHASE_#SEMESTER_, vce(cluster STRM_SECT) nolog 
      
      Logistic regression                               Number of obs   =       2412
                                                        Wald chi2(5)    =      49.58
                                                        Prob > chi2     =     0.0000
      Log pseudolikelihood = -929.54546                 Pseudo R2       =     0.0174
      
                                       (Std. Err. adjusted for 33 clusters in STRM_SECT)
      ----------------------------------------------------------------------------------
                       |               Robust
                   DFW |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -----------------+----------------------------------------------------------------
                PHASE_ |
              Phase 2  |  -.0808569   .2089221    -0.39   0.699    -.4903367    .3286229
              Phase 3  |  -.7077038    .394844    -1.79   0.073    -1.481584    .0661762
                       |
             SEMESTER_ |
               Spring  |   .6049988   .2006505     3.02   0.003     .2117311    .9982665
                       |
      PHASE_#SEMESTER_ |
       Phase 2#Spring  |   -.193015   .3038102    -0.64   0.525    -.7884719     .402442
       Phase 3#Spring  |  -.0670976     .41312    -0.16   0.871    -.8767981    .7426028
                       |
                 _cons |  -1.851531    .175486   -10.55   0.000    -2.195478   -1.507585
      ----------------------------------------------------------------------------------
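A side note linking this output back to the earlier est table: the odds ratios shown there are just exp(coefficient) applied to the log-odds estimates above, e.g. in Python:

```python
import math

# Coefficients copied from the interaction-model output above
coefs = {
    "Phase 2":        -0.0808569,
    "Phase 3":        -0.7077038,
    "Spring":          0.6049988,
    "Phase 2#Spring": -0.193015,
    "Phase 3#Spring": -0.0670976,
}

for name, b in coefs.items():
    print(f"{name}: OR = {math.exp(b):.6f}")
# e.g. Phase 2#Spring: OR = 0.824470, matching .82446964 in column M3
```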

      • #4
        While the distribution of PHASE is pretty lopsided, it looks like you have reasonable numbers of observations in each PHASE#SEMESTER combination (at least when you add up the DFW and ABC counts), but the Phase 3 DFW counts look a bit meager in both semesters, and those in Phase 1 are not much better.

        The phase 2#Spring interaction is more than double the magnitude of the Phase 2 main effect.

        I have no way of knowing how much noise there is in your DFW outcome variable.

        So all in all, the purely statistical case for dropping the interaction is weak. It would be best to say that your data are consistent with either a presence or an absence of interaction and your power to detect one large enough to matter for practical purposes is probably small. You've already said you're going to pursue the underlying science for further advice, and I would probably go with whatever conclusion you find there.

        But let me ask you another question. You said that the literature generally does not include an interaction term in these analyses. Why, then, did you choose to include one? Perhaps you had something in mind, something based on your understanding of the real-world process you are modeling. What was that? Was it just a lark? Or was there a reason you thought you could expect to find interaction here, even if it isn't normally a consideration? Does that reason still make sense to you now? If it does, that itself is an argument for keeping the interaction.

        Finally, there is another way of looking at it. The predicted probability of DFW = 1 conditional on PHASE = Phase2 and Semester = SPRING is, if I have done the calculations correctly (and I may not have), about 0.179 in the interaction model and 0.188 in the non-interaction model. [You can get correct numbers for these probabilities by running the -margins- command after each model] Is that difference in predicted outcome probability important enough to write home about? If not, you can probably drop the interaction term unless you come up with a scientific rationale for it; if so, you should probably keep it.
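Those two probabilities can also be recovered by hand from the coefficient tables in #3: sum the relevant log-odds terms for Phase 2 in Spring and apply the inverse logit. A quick check in Python, with the coefficients copied from the two outputs above:

```python
import math

def invlogit(x):
    """Inverse logit: converts log-odds to a probability."""
    return 1 / (1 + math.exp(-x))

# Phase 2 in Spring, model without interaction: _cons + Phase 2 + Spring
p_main = invlogit(-1.800719 - 0.157469 + 0.4956387)

# With interaction: _cons + Phase 2 + Spring + Phase 2#Spring
p_int = invlogit(-1.851531 - 0.0808569 + 0.6049988 - 0.193015)

print(round(p_main, 3), round(p_int, 3))   # 0.188 0.179
```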

        • #5
          But let me ask you another question. You said that the literature generally does not include an interaction term in these analyses. Why, then, did you choose to include one? Perhaps you had something in mind, something based on your understanding of the real-world process you are modeling. What was that? Was it just a lark? Or was there a reason you thought you could expect to find interaction here, even if it isn't normally a consideration? Does that reason still make sense to you now?
          In retrospect, probably just a lark. However, I had wondered whether underlying differences between the fall and spring student populations might lead them to respond in different ways to the same course design changes, and I was curious about the changes in the DFW proportions from Phase 1 to Phase 2 in the fall versus in the spring. I don't have much to go on, though.

          The predicted probability of DFW = 1 conditional on PHASE = Phase2 and Semester = SPRING is, if I have done the calculations correctly (and I may not have), about 0.179 in the interaction model and 0.188 in the non-interaction model. [You can get correct numbers for these probabilities by running the -margins- command after each model] Is that difference in predicted outcome probability important enough to write home about?
          This is my first time using the margins command, but I got the same numbers you did. Yay! ;-)

          Code:
          . * after the model without the interaction
          .  margins 1.PHASE_, at(SEMESTER_=1) vce(unconditional)
          
          Adjusted predictions                              Number of obs   =       2412
          
          Expression   : Pr(DFW), predict()
          at           : SEMESTER_       =           1
          
                                       (Std. Err. adjusted for 33 clusters in STRM_SECT)
          ------------------------------------------------------------------------------
                       |            Unconditional
                       |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                PHASE_ |
              Phase 2  |   .1880778   .0223874     8.40   0.000     .1441994    .2319562
          ------------------------------------------------------------------------------
          
          
          . * after the model with the interaction
          .  margins 1.PHASE_, at(SEMESTER_=1) vce(unconditional)
          
          Adjusted predictions                              Number of obs   =       2412
          
          Expression   : Pr(DFW), predict()
          at           : SEMESTER_       =           1
          
                                       (Std. Err. adjusted for 33 clusters in STRM_SECT)
          ------------------------------------------------------------------------------
                       |            Unconditional
                       |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                PHASE_ |
              Phase 2  |    .179402   .0291426     6.16   0.000     .1222836    .2365204
          ------------------------------------------------------------------------------
          To answer your question, I don't believe that difference to be especially noteworthy. So, unless I find a theoretical motivation to the contrary, I will leave the interaction out.

          Thank you so much for your assistance!
          Lauren

          • #6
            Lauren, if you are going to present a table of coefficients with significance stars and without standard errors, I would also mark marginal significance in the table (^ p<.10). In M2 the main effect of Phase 3 is significant, while in M3 it loses its significance entirely and the Phase 3#Spring interaction didn't absorb it. As a reader, I would just like to know what is happening behind the results, and either SEs or a marginal-significance marker would suffice (for me).

            • #7
              Thanks for your input, Oded. I completely agree and intended to include the SEs.
