  • problem with a simple nested anova

    I'm having trouble with a nested ANOVA on a fairly simple dataset. Concentrations of a chemical were measured in the blood and two organs of four individuals. I would like to compare the concentration across matrices, accounting for the individual (id):

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int id float(matrix conc)
     73 0 276
     75 0 214
     79 0 227
    121 0 241
     73 1 168
     75 1 144
     79 1 147
    121 1 154
     73 2 253
     75 2 195
     79 2 179
    121 2 224
    end
    If I run a plain vanilla -anova-, I see that the concentration in one organ (matrix 1) differs from that in blood (matrix 0), but the concentration in the other organ (matrix 2) does not:

    Code:
    . anova conc matrix
    
                             Number of obs =         12    R-squared     =  0.7328
                             Root MSE      =    25.1319    Adj R-squared =  0.6735
    
                      Source | Partial SS         df         MS        F    Prob>F
                  -----------+----------------------------------------------------
                       Model |  15593.167          2   7796.5833     12.34  0.0026
                             |
                      matrix |  15593.167          2   7796.5833     12.34  0.0026
                             |
                    Residual |     5684.5          9   631.61111  
                  -----------+----------------------------------------------------
                       Total |  21277.667         11   1934.3333  
    
    . pwcompare matrix, pveffects mcompare(scheffe)
    
    Pairwise comparisons of marginal linear predictions
    
    Margins      : asbalanced
    
    ---------------------------
                 |    Number of
                 |  Comparisons
    -------------+-------------
          matrix |            3
    ---------------------------
    
    -----------------------------------------------------
                 |                             Scheffe
                 |   Contrast   Std. Err.      t    P>|t|
    -------------+---------------------------------------
          matrix |
         1 vs 0  |     -86.25   17.77092    -4.85   0.003
         2 vs 0  |     -26.75   17.77092    -1.51   0.364
         2 vs 1  |       59.5   17.77092     3.35   0.026
    -----------------------------------------------------
    However, since the matrices are nested in the individual, I should run a nested ANOVA:

    Code:
    . anova conc matrix / matrix|id /
    
                             Number of obs =         12    R-squared     =  1.0000
                             Root MSE      =          0    Adj R-squared =
    
                      Source | Partial SS         df         MS        F    Prob>F
                  -----------+----------------------------------------------------
                       Model |  21277.667         11   1934.3333  
                             |
                      matrix |  15593.167          2   7796.5833     12.34  0.0026
                   matrix|id |     5684.5          9   631.61111  
                  -----------+----------------------------------------------------
                   matrix|id |     5684.5          9   631.61111  
                             |
                    Residual |          0          0
                  -----------+----------------------------------------------------
                       Total |  21277.667         11   1934.3333  
    
    . pwcompare matrix, pveffects mcompare(scheffe)
    
    Pairwise comparisons of marginal linear predictions
    
    Margins      : asbalanced
    
    ---------------------------
                 |    Number of
                 |  Comparisons
    -------------+-------------
          matrix |            3
    ---------------------------
    
    -----------------------------------------------------
                 |                             Scheffe
                 |   Contrast   Std. Err.      t    P>|t|
    -------------+---------------------------------------
          matrix |
         1 vs 0  |     -86.25          .        .       .
         2 vs 0  |     -26.75          .        .       .
         2 vs 1  |       59.5          .        .       .
    -----------------------------------------------------
    That doesn't look right! There's obviously something wrong with that model. But with such a simple dataset, the model should presumably also be simple. Your suggestions would be appreciated!
    Last edited by Nigel Moore; 02 Dec 2017, 06:25.
    Stata 14.2MP
    OS X

  • #2
    One other consideration. I tried a -mixed- analysis with id as the repeated measures indicator, and that worked well showing that organ 2 was also different to blood. But I seem to recall that nested ANOVA is recommended over -mixed- for small datasets:

    Code:
    . mixed conc i.matrix || id:
    
    Performing EM optimization:
    
    Performing gradient-based optimization:
    
    Iteration 0:   log likelihood = -50.455837  
    Iteration 1:   log likelihood = -50.455837  
    
    Computing standard errors:
    
    Mixed-effects ML regression                     Number of obs     =         12
    Group variable: id                              Number of groups  =          4
    
                                                    Obs per group:
                                                                  min =          3
                                                                  avg =        3.0
                                                                  max =          3
    
                                                    Wald chi2(2)      =     125.31
    Log likelihood = -50.455837                     Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
            conc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          matrix |
              1  |     -86.25   7.887885   -10.93   0.000      -101.71   -70.79003
              2  |     -26.75   7.887885    -3.39   0.001    -42.20997   -11.29003
                 |
           _cons |      239.5   10.88242    22.01   0.000     218.1708    260.8292
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    id: Identity                 |
                      var(_cons) |   349.2711   277.0795      73.77305    1653.589
    -----------------------------+------------------------------------------------
                   var(Residual) |   124.4375   62.21872      46.70361    331.5521
    ------------------------------------------------------------------------------
    LR test vs. linear model: chibar2(01) = 7.07          Prob >= chibar2 = 0.0039
    
    . margins matrix
    
    Adjusted predictions                            Number of obs     =         12
    
    Expression   : Linear prediction, fixed portion, predict()
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          matrix |
              0  |      239.5   10.88242    22.01   0.000     218.1708    260.8292
              1  |     153.25   10.88242    14.08   0.000     131.9208    174.5792
              2  |     212.75   10.88242    19.55   0.000     191.4208    234.0792
    ------------------------------------------------------------------------------
    
    . pwcompare matrix, pveffects mcompare(scheffe)
    
    Pairwise comparisons of marginal linear predictions
    
    Margins      : asbalanced
    
    ---------------------------
                 |    Number of
                 |  Comparisons
    -------------+-------------
    conc         |
          matrix |            3
    ---------------------------
    
    -----------------------------------------------------
                 |                             Scheffe
                 |   Contrast   Std. Err.      z    P>|z|
    -------------+---------------------------------------
    conc         |
          matrix |
         1 vs 0  |     -86.25   7.887885   -10.93   0.000
         2 vs 0  |     -26.75   7.887885    -3.39   0.003
         2 vs 1  |       59.5   7.887885     7.54   0.000
    -----------------------------------------------------
    Stata 14.2MP
    OS X


    • #3
      Are the two organs the same two organs in all patients? If so, your design is a repeated-measures design, with matrix crossed with subject.

      anova conc matrix id, repeated(matrix)
      pwcompare matrix
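
      A quick way to check the design, sketched under the assumption that the data from #1 are still in memory: cross-tabulating id against matrix should show exactly one observation in every cell, i.e. every individual contributes all three matrices. That layout would also explain the empty Residual row in #1. With one observation per cell, the matrix|id term uses 3 x (4 - 1) = 9 df, which together with the 2 df for matrix exhausts the 11 total df.

      ```stata
      * Assumes the dataset from #1 (variables id, matrix, conc) is in memory
      * One observation in every cell indicates id is crossed with matrix
      tabulate id matrix

      * df left for Residual in the nested model of #1:
      * (N - 1) - df(matrix) - df(matrix|id)
      display (12 - 1) - (3 - 1) - 3*(4 - 1)
      ```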


      • #4
        I agree that this appears to be (what I would call) a one-factor repeated measures ANOVA (a.k.a. a Treatment x Subjects design). I would not use pwcompare for this design, though, unless there is a way to make it use a different error term for every contrast. For repeated measures designs, there is no reason to expect the Treatment x Subjects interaction (i.e., the error term) to be the same for every pair of conditions, and so there is no good reason to use the overall error term from the ANOVA table. Many authors recommend using ordinary paired t-tests for the pair-wise contrasts.

        Also, note that when there are 3 conditions, carrying out all 3 pair-wise contrasts conditional on first observing a statistically significant omnibus F-test ensures that the family-wise alpha equals the per-contrast alpha. As Meier (2006) said, "Fisher's LSD procedure is known to preserve the experimentwise type I error rate at the nominal level of significance, if (and only if) the number of treatment groups is three." (See p. 41 in this chapter from an old edition of Dave Howell's Statistical Methods for Psychology for an explanation.) Granted, Meier was talking about pair-wise contrasts in a between-Ss ANOVA (using a pooled error term). But I believe the same logic holds for repeated measures designs.

        In the following example, I renamed Nigel's matrix variable to site (because matrix was causing problems on my sort command).

        Code:
        . * Example generated by -dataex-. To install: ssc install dataex
        . clear
        
        . input int id float(site conc)
        
                   id       site       conc
          1.  73 0 276
          2.  75 0 214
          3.  79 0 227
          4. 121 0 241
          5.  73 1 168
          6.  75 1 144
          7.  79 1 147
          8. 121 1 154
          9.  73 2 253
         10.  75 2 195
         11.  79 2 179
         12. 121 2 224
         13. end
        
        . sort id site
        
        . list, sepby(id)
        
             +-------------------+
             |  id   site   conc |
             |-------------------|
          1. |  73      0    276 |
          2. |  73      1    168 |
          3. |  73      2    253 |
             |-------------------|
          4. |  75      0    214 |
          5. |  75      1    144 |
          6. |  75      2    195 |
             |-------------------|
          7. |  79      0    227 |
          8. |  79      1    147 |
          9. |  79      2    179 |
             |-------------------|
         10. | 121      0    241 |
         11. | 121      1    154 |
         12. | 121      2    224 |
             +-------------------+
        
        .
        . * This appears to be a one-factor repeated measures ANOVA.
        . anova conc id site, repeated(site)
        
                                 Number of obs =         12    R-squared     =  0.9532
                                 Root MSE      =    12.8809    Adj R-squared =  0.9142
        
                          Source | Partial SS         df         MS        F    Prob>F
                      -----------+----------------------------------------------------
                           Model |  20282.167          5   4056.4333     24.45  0.0006
                                 |
                              id |       4689          3        1563      9.42  0.0109
                            site |  15593.167          2   7796.5833     46.99  0.0002
                                 |
                        Residual |      995.5          6   165.91667  
                      -----------+----------------------------------------------------
                           Total |  21277.667         11   1934.3333  
        
        
        Between-subjects error term:  id
                             Levels:  4         (3 df)
             Lowest b.s.e. variable:  id
        
        Repeated variable: site
                                                  Huynh-Feldt epsilon        =  1.2609
                                                  *Huynh-Feldt epsilon reset to 1.0000
                                                  Greenhouse-Geisser epsilon =  0.7333
                                                  Box's conservative epsilon =  0.5000
        
                                                    ------------ Prob > F ------------
                          Source |     df      F    Regular    H-F      G-G      Box
                      -----------+----------------------------------------------------
                            site |      2    46.99   0.0002   0.0002   0.0013   0.0064
                        Residual |      6
                      ----------------------------------------------------------------
        
        .
        . * When there are 3 groups, or 3 conditions as in this case,
        . * carrying out all 3 pair-wise contrasts conditional on a
        . * significant omnibus test preserves the family-wise alpha
        . * at the alpha level used for the omnibus test and for each
        . * of the pair-wise contrasts.  For between-Ss ANOVA, the
        . * pair-wise contrasts are carried out via modified t-tests
        . * that all use SQRT(MS_error) from the ANOVA table as the SE.
        . * But for repeated measures ANOVA, it does not make sense to
        . * use the MS_error from the ANOVA summary table, because there
        . * is no reason to expect the Treatment x Subjects interaction
        . * to be similar in nature across all pairs of conditions.  
        . * For that reason, some authors recommend using ordinary
        . * paired t-tests to make the pair-wise comparisons.
        . * Let's see what -pwcompare- does after the RM ANOVA done above.
        .
        . pwcompare site, effects
        
        Pairwise comparisons of marginal linear predictions
        
        Margins      : asbalanced
        
        ------------------------------------------------------------------------------
                     |                            Unadjusted           Unadjusted
                     |   Contrast   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                site |
             1 vs 0  |     -86.25   9.108147    -9.47   0.000    -108.5368   -63.96317
             2 vs 0  |     -26.75   9.108147    -2.94   0.026    -49.03683   -4.463168
             2 vs 1  |       59.5   9.108147     6.53   0.001     37.21317    81.78683
        ------------------------------------------------------------------------------
        
        .
        . * Notice that the SE is the same for all 3 contrasts.
        . * This is not what I want.  I don't know if there is a way
        . * to make -pwcompare- perform ordinary paired t-tests.
        . * If not, one can always reshape the dataset and do them
        . * the old-fashioned way.
        .
        . reshape wide conc, i(id) j(site)
        (note: j = 0 1 2)
        
        Data                               long   ->   wide
        -----------------------------------------------------------------------------
        Number of obs.                       12   ->       4
        Number of variables                   3   ->       4
        j variable (3 values)              site   ->   (dropped)
        xij variables:
                                           conc   ->   conc0 conc1 conc2
        -----------------------------------------------------------------------------
        
        . ttest conc1 == conc0
        
        Paired t test
        ------------------------------------------------------------------------------
        Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
        ---------+--------------------------------------------------------------------
           conc1 |       4      153.25     5.34439    10.68878    136.2418    170.2582
           conc0 |       4       239.5    13.35727    26.71454    196.9912    282.0088
        ---------+--------------------------------------------------------------------
            diff |       4      -86.25    8.045444    16.09089   -111.8542   -60.64581
        ------------------------------------------------------------------------------
             mean(diff) = mean(conc1 - conc0)                             t = -10.7204
         Ho: mean(diff) = 0                              degrees of freedom =        3
        
         Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
         Pr(T < t) = 0.0009         Pr(|T| > |t|) = 0.0017          Pr(T > t) = 0.9991
        
        . ttest conc2 == conc0
        
        Paired t test
        ------------------------------------------------------------------------------
        Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
        ---------+--------------------------------------------------------------------
           conc2 |       4      212.75    16.33185    32.66369    160.7748    264.7252
           conc0 |       4       239.5    13.35727    26.71454    196.9912    282.0088
        ---------+--------------------------------------------------------------------
            diff |       4      -26.75    7.192299     14.3846   -49.63911   -3.860894
        ------------------------------------------------------------------------------
             mean(diff) = mean(conc2 - conc0)                             t =  -3.7193
         Ho: mean(diff) = 0                              degrees of freedom =        3
        
         Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
         Pr(T < t) = 0.0169         Pr(|T| > |t|) = 0.0338          Pr(T > t) = 0.9831
        
        . ttest conc2 == conc1
        
        Paired t test
        ------------------------------------------------------------------------------
        Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
        ---------+--------------------------------------------------------------------
           conc2 |       4      212.75    16.33185    32.66369    160.7748    264.7252
           conc1 |       4      153.25     5.34439    10.68878    136.2418    170.2582
        ---------+--------------------------------------------------------------------
            diff |       4        59.5    11.50724    23.01449    22.87881    96.12119
        ------------------------------------------------------------------------------
             mean(diff) = mean(conc2 - conc1)                             t =   5.1707
         Ho: mean(diff) = 0                              degrees of freedom =        3
        
         Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
         Pr(T < t) = 0.9930         Pr(|T| > |t|) = 0.0140          Pr(T > t) = 0.0070
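
        With more conditions, typing each -ttest- by hand would get tedious. Here is a minimal sketch of a loop over all pairs, assuming the data are already in the wide layout produced by the -reshape- above (conc0 conc1 conc2). The local macro names a, b and last are only illustrative. Keep in mind that the family-wise alpha argument quoted above applies to three conditions only.

        ```stata
        * Assumes wide data with variables conc0, conc1, conc2 (from -reshape wide- above)
        forvalues a = 1/2 {
            local last = `a' - 1
            forvalues b = 0/`last' {
                * Paired t-test of condition a against each lower-numbered condition b
                ttest conc`a' == conc`b'
            }
        }
        ```

        For the three conditions here, this runs the same three paired t-tests shown above, in the same order.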

        HTH.
        --
        Bruce Weaver
        Version: Stata/MP 18.5 (Windows)


        • #5
          I agree about the t-test comments above for repeated measures ANOVA.


          • #6
            I suggest approaching it more systematically, for example by explicitly examining the covariance structure (see below) to see how reasonable the assumption that Bruce mentions in #4 is. In this case, the residual variance ranges from 100 to 1000, so Bruce has a point. I've seen the same advice that Bruce mentions (individual t-tests). Although it's easy to run a handful of t-tests, you can also model the error covariance structure using mixed and then use its small-sample adjustments to help accommodate the small size of the dataset. That gives you access to all of the postestimation commands for mixed that are not available after individual t-tests, such as the pwcompare postestimation command that was mentioned a couple of times in this thread.

            Because you have balanced data, you can also do the same with MANOVA. Again, the advantage of manova over a bunch of t-tests is the availability of postestimation commands, for example, the ability to test joint hypotheses (example below) that are more difficult or impossible to pull off with pairwise t-tests.
            Code:
            version 15.1
            
            clear *
            
            input int id  byte matrix int conc
             73 0 276
             75 0 214
             79 0 227
            121 0 241
             73 1 168
             75 1 144
             79 1 147
            121 1 154
             73 2 253
             75 2 195
             79 2 179
            121 2 224
            end
            
            *
            * Examination of residual error structure (& testing assumption Bruce brought up in #4)
            *
            mixed conc i.matrix || id:, noconstant residuals(unstructured, t(matrix)) nolrtest nolog
            estimates store Unstructured
            
            mixed conc i.matrix || id:, noconstant residuals(exchangeable) nolrtest nolog
            estimates store Exchangeable
            
            lrtest Unstructured Exchangeable
            
            estimates drop _all
            
            *
            * Modeling the residual error with small-sample adjustments using -mixed- (allows -pwcompare- etc.)
            *
            mixed conc i.matrix || id:, noconstant reml dfmethod(satterthwaite) residuals(unstructured, t(matrix)) nolrtest nolog
            pwcompare i.matrix, small effects
            
            *
            * Ditto using MANOVA (exact test statistics)
            *
            quietly reshape wide conc, i(id) j(matrix)
             
            generate byte k = 1.
            
            /* Begin message to StataCorp
            
               The following are new undesired behaviors with this mistaken syntax:
             
            manova conc0-conc2 = k
             
                (1) uninformative "error message"
                (2) attempt to lookup error yields "No entries found"
                (3) -capture noisily- doesn't display anything
            
               End message to StataCorp */
             
            *
            * Omnibus test
            *
            manova conc0-conc2 = k, noconstant
             
            *
            * Pairwise t-tests
            *
            matrix input Contrast01 = (1 -1 0)
            matrix input Contrast02 = (1 0 -1)
            matrix input Contrast12 = (0 -1 1)
             
            manovatest k, ytransform(Contrast01)
            manovatest k, ytransform(Contrast02)
            manovatest k, ytransform(Contrast12)
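            
            * Note (an assumption added for illustration, not part of the original do-file):
            * each test above involves a single contrast, hence 1 numerator df, so its F
            * statistic should equal the square of the paired t for the same pair.
            * With t = -10.7204 from -ttest conc1 == conc0- in #4, compare the F for
            * Contrast01 with:
            display (-10.7204)^2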
            
            *
            * Joint test (e.g., do the two organs differ from blood?)
            *
            matrix define Joint = Contrast01 \ Contrast02
            manovatest k, ytransform(Joint)
            
            exit
            Last edited by Joseph Coveney; 02 Dec 2017, 18:59.
