Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • multiple pairwise comparisons after quadratic regression

    After running a standard linear regression, it is easy enough to run pairwise comparisons as ling as the indepvar is factorial. But I have a non-monotonic response for which a linear regression is inappropriate:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int dose double response float dose2
      0 7.14     0
      0 6.89     0
      0 9.09     0
      0 7.83     0
      0 9.47     0
     10 8.75   100
     10 7.91   100
     10 9.93   100
     10 9.84   100
     10 8.37   100
     50 10.2  2500
     50 10.7  2500
     50 10.5  2500
     50 8.43  2500
     50 10.9  2500
    100 11.1 10000
    100 10.6 10000
    100 10.6 10000
    100 10.9 10000
    100 10.9 10000
    200 10.4 40000
    200   11 40000
    200   10 40000
    200 9.23 40000
    200 10.1 40000
    end
    Here, dose2 is simply dose^2. Dose could be seen as both an ordinal and continuous variable.

    Quadratic regression improves the fit over the linear model:
    Code:
    . regress response dose
    
          Source |       SS           df       MS      Number of obs   =        25
    -------------+----------------------------------   F(1, 23)        =      9.13
           Model |  10.8728249         1  10.8728249   Prob > F        =    0.0061
        Residual |  27.3822391        23  1.19053214   R-squared       =    0.2842
    -------------+----------------------------------   Adj R-squared   =    0.2531
           Total |   38.255064        24    1.593961   Root MSE        =    1.0911
    
    ------------------------------------------------------------------------------
        response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            dose |    .009028   .0029874     3.02   0.006     .0028481    .0152079
           _cons |   8.981181   .3064083    29.31   0.000     8.347328    9.615035
    ------------------------------------------------------------------------------
    
    . regress response c.dose##c.dose
    
          Source |       SS           df       MS      Number of obs   =        25
    -------------+----------------------------------   F(2, 22)        =     17.14
           Model |  23.2990371         2  11.6495185   Prob > F        =    0.0000
        Residual |  14.9560269        22  .679819406   R-squared       =    0.6090
    -------------+----------------------------------   Adj R-squared   =    0.5735
           Total |   38.255064        24    1.593961   Root MSE        =    .82451
    
    -------------------------------------------------------------------------------
         response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------------+----------------------------------------------------------------
             dose |   .0433276   .0083342     5.20   0.000     .0260436    .0606116
                  |
    c.dose#c.dose |  -.0001714   .0000401    -4.28   0.000    -.0002546   -.0000883
                  |
            _cons |   8.314824   .2791116    29.79   0.000     7.735982    8.893666
    -------------------------------------------------------------------------------
    This difference is significant:
    Code:
    . nestreg: regress response dose dose2
    
    Block  1: dose
    
          Source |       SS           df       MS      Number of obs   =        25
    -------------+----------------------------------   F(1, 23)        =      9.13
           Model |  10.8728249         1  10.8728249   Prob > F        =    0.0061
        Residual |  27.3822391        23  1.19053214   R-squared       =    0.2842
    -------------+----------------------------------   Adj R-squared   =    0.2531
           Total |   38.255064        24    1.593961   Root MSE        =    1.0911
    
    ------------------------------------------------------------------------------
        response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            dose |    .009028   .0029874     3.02   0.006     .0028481    .0152079
           _cons |   8.981181   .3064083    29.31   0.000     8.347328    9.615035
    ------------------------------------------------------------------------------
    
    Block  2: dose2
    
          Source |       SS           df       MS      Number of obs   =        25
    -------------+----------------------------------   F(2, 22)        =     17.14
           Model |  23.2990371         2  11.6495185   Prob > F        =    0.0000
        Residual |  14.9560269        22  .679819406   R-squared       =    0.6090
    -------------+----------------------------------   Adj R-squared   =    0.5735
           Total |   38.255064        24    1.593961   Root MSE        =    .82451
    
    ------------------------------------------------------------------------------
        response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            dose |   .0433276   .0083342     5.20   0.000     .0260436    .0606116
              dose2 |  -.0001714   .0000401    -4.28   0.000    -.0002546   -.0000883
           _cons |   8.314824   .2791116    29.79   0.000     7.735982    8.893666
    ------------------------------------------------------------------------------
    
    
      +-------------------------------------------------------------+
      |       |          Block  Residual                     Change |
      | Block |       F     df        df   Pr > F       R2    in R2 |
      |-------+-----------------------------------------------------|
      |     1 |    9.13      1        23   0.0061   0.2842          |
      |     2 |   18.28      1        22   0.0003   0.6090   0.3248 |
      +-------------------------------------------------------------+
    But is it possible to now run a pairwise comparison? I tried running the quadratic with both dose and dose2 as factor variables, but the output was no different to that of linear regression against i.dose due to colinearity. Should I just be satisfied with identifying the trend, without pairwise comparison between groups? (Please note, I'm interested in comparing the means of measured values, not predicted means).
    Stata 14.2MP
    OS X

  • #2
    Originally posted by Nigel Moore View Post
    Should I just be satisfied with identifying the trend, without pairwise comparison between groups? (Please note, I'm interested in comparing the means of measured values, not predicted means).
    No, you can do both: identify a linear and quadratic relationship and do pairwise comparisons of the means of measured values, all in one regression using dose as a factor variable. See below.
    Code:
    version 15.0
    
    clear *
    
    quietly input int dose double response float dose2
      0 7.14     0
      0 6.89     0
      0 9.09     0
      0 7.83     0
      0 9.47     0
     10 8.75   100
     10 7.91   100
     10 9.93   100
     10 9.84   100
     10 8.37   100
     50 10.2  2500
     50 10.7  2500
     50 10.5  2500
     50 8.43  2500
     50 10.9  2500
    100 11.1 10000
    100 10.6 10000
    100 10.6 10000
    100 10.9 10000
    100 10.9 10000
    200 10.4 40000
    200   11 40000
    200   10 40000
    200 9.23 40000
    200 10.1 40000
    end
    
    drop dose2
    
    table dose, contents(mean response) // For reference in -pwcompare- below
    
    regress response i.dose
    contrast p.dose // Here's your quadratic
    
    //  "(Please note, I'm interested in comparing the means of measured values, not predicted means). "
    pwcompare dose, pveffects mcompare(dunnett) // If comparisons to zero-dose control group--these _are_ comparisons of the means
    pwcompare dose, pveffects mcompare(bonferroni) // Otherwise--again, because of use of factor variable, these _are also_ comparisons of the means
    
    exit
    I don't quite understand the statement, "I tried running the quadratic with both dose and dose2 as factor variables, but the output was no different to that of linear regression against i.dose due to colinearity." Maybe the output was similar because there isn't much beyond the quadratic (see output from -contrast- command above). I don't see where colinearity comes into play here.

    Comment


    • #3
      Thank you, Joseph, that's really helpful!

      The regression is just linear, correct? I had assumed that -contrast- only examined the previously run regression. So what we're doing here is running -pwcompare- after a linear regression. I guess that males sense, the pairwise comparisons are between groups, and are independent of the shape of the regression curve.

      Originally posted by Joseph Coveney View Post
      I don't quite understand the statement, "I tried running the quadratic with both dose and dose2 as factor variables, but the output was no different to that of linear regression against i.dose due to colinearity." Maybe the output was similar because there isn't much beyond the quadratic (see output from -contrast- command above). I don't see where colinearity comes into play here.
      This is the linear regression:

      Code:
      . regress response i.dose
      
            Source |       SS           df       MS      Number of obs   =        25
      -------------+----------------------------------   F(4, 20)        =      8.36
             Model |   23.938104         4    5.984526   Prob > F        =    0.0004
          Residual |    14.31696        20     .715848   R-squared       =    0.6257
      -------------+----------------------------------   Adj R-squared   =    0.5509
             Total |   38.255064        24    1.593961   Root MSE        =    .84608
      
      ------------------------------------------------------------------------------
          response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              dose |
               10  |       .876   .5351067     1.64   0.117    -.2402131    1.992213
               50  |      2.062   .5351067     3.85   0.001     .9457869    3.178213
              100  |      2.736   .5351067     5.11   0.000     1.619787    3.852213
              200  |      2.062   .5351067     3.85   0.001     .9457869    3.178213
                   |
             _cons |      8.084   .3783776    21.36   0.000     7.294718    8.873282
      ------------------------------------------------------------------------------
      And the quadratic:

      Code:
      . regress response i.dose i.dose2
      note: 100.dose2 omitted because of collinearity
      note: 2500.dose2 omitted because of collinearity
      note: 10000.dose2 omitted because of collinearity
      note: 40000.dose2 omitted because of collinearity
      
            Source |       SS           df       MS      Number of obs   =        25
      -------------+----------------------------------   F(4, 20)        =      8.36
             Model |   23.938104         4    5.984526   Prob > F        =    0.0004
          Residual |    14.31696        20     .715848   R-squared       =    0.6257
      -------------+----------------------------------   Adj R-squared   =    0.5509
             Total |   38.255064        24    1.593961   Root MSE        =    .84608
      
      ------------------------------------------------------------------------------
          response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
              dose |
               10  |       .876   .5351067     1.64   0.117    -.2402131    1.992213
               50  |      2.062   .5351067     3.85   0.001     .9457869    3.178213
              100  |      2.736   .5351067     5.11   0.000     1.619787    3.852213
              200  |      2.062   .5351067     3.85   0.001     .9457869    3.178213
                   |
             dose2 |
              100  |          0  (omitted)
             2500  |          0  (omitted)
            10000  |          0  (omitted)
            40000  |          0  (omitted)
                   |
             _cons |      8.084   .3783776    21.36   0.000     7.294718    8.873282
      ------------------------------------------------------------------------------
      They appear identical in all aspects, except that the second has the dose2 items omitted.
      Stata 14.2MP
      OS X

      Comment


      • #4
        The variable ‘dose’ being categorical, there is no sense in creating a squared term of it.
        Best regards,

        Marcos

        Comment


        • #5
          Originally posted by Marcos Almeida View Post
          The variable ‘dose’ being categorical, there is no sense in creating a squared term of it.
          But it's not simply categorical, I think. It's the amount of drug applied, so in that case it is both ordinal and continuous. Although the dose levels are discrete (there is nothing between 50 and 100, for example), a dose of 200 is double that of 100 and four times that of 50.

          Does that make a difference?
          Stata 14.2MP
          OS X

          Comment


          • #6
            Simply put, it shall be taken as a categorical variable. Point. When we use factor notation, we undoubtedly "tell" Stata it is a categorical variable.

            Please take some time to grasp the core information about categorical variables. It is found in any decent textbook on stats. In short, what categorization really "means" is not at all related to the "mean" value and the likes, for example, or even the concept of "value", so to speak. However basic it might be, it is a crucial step.
            Best regards,

            Marcos

            Comment


            • #7
              Originally posted by Nigel Moore View Post
              The regression is just linear, correct?
              regress fits linear models, but the functional relationship between response and dose is nonlinear. You can think of the model (when using a factor variable for the dose) as a connect-the-dots fitting of the nonlinear dose-response relationship.

              Originally posted by Nigel Moore View Post
              I had assumed that -contrast- only examined the previously run regression.
              It does. And it did in my example.

              Originally posted by Nigel Moore View Post
              This is the linear regression:

              Code:
              regress response i.dose
              And the quadratic:

              Code:
              regress response i.dose i.dose2
              I see now what you were doing. I had thought that you compared
              Code:
              regress response c.dose##c.dose
              to
              Code:
              regress response i.dose
              and found them to be identical.

              Comment


              • #8
                Hi Nigel. If I understand what you are trying to do, you may find it beneficial to study the following example.

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input int dose double response float dose2
                  0 7.14     0
                  0 6.89     0
                  0 9.09     0
                  0 7.83     0
                  0 9.47     0
                 10 8.75   100
                 10 7.91   100
                 10 9.93   100
                 10 9.84   100
                 10 8.37   100
                 50 10.2  2500
                 50 10.7  2500
                 50 10.5  2500
                 50 8.43  2500
                 50 10.9  2500
                100 11.1 10000
                100 10.6 10000
                100 10.6 10000
                100 10.9 10000
                100 10.9 10000
                200 10.4 40000
                200   11 40000
                200   10 40000
                200 9.23 40000
                200 10.1 40000
                end
                
                * Method 1a:  Dose as categorical with polynomial trend analysis & Dunnett's
                regress response i.dose
                contrast p.dose, pveffects
                pwcompare dose, pveffects mcompare(dunnett)
                
                * Method 1b:  Same as 1a, but using -anova- rather than -regress
                anova response i.dose
                contrast p.dose, pveffects
                pwcompare dose, pveffects mcompare(dunnett)
                
                * Method 2:  Dose as continuous.
                * In this case, to get an F-test for the overall model that matches
                * those of 1a and 1b, we need to include the linear, quadratic,
                * cubic and quartic terms in the model (given that there are 5 doses).
                regress response c.dose##c.dose##c.dose##c.dose
                
                * Compare each dose to 0 via -lincom-
                local d = 10
                lincom ///  10 vs 0
                _b[dose]*`d' + ///
                _b[c.dose#c.dose]*`d'*`d' + ///
                _b[c.dose#c.dose#c.dose]*`d'*`d'*`d' + ///
                _b[c.dose#c.dose#c.dose#c.dose]*`d'*`d'*`d'*`d'
                
                local d = 50
                lincom ///  50 vs 0
                _b[dose]*`d' + ///
                _b[c.dose#c.dose]*`d'*`d' + ///
                _b[c.dose#c.dose#c.dose]*`d'*`d'*`d' + ///
                _b[c.dose#c.dose#c.dose#c.dose]*`d'*`d'*`d'*`d'
                
                local d = 100
                lincom ///  100 vs 0
                _b[dose]*`d' + ///
                _b[c.dose#c.dose]*`d'*`d' + ///
                _b[c.dose#c.dose#c.dose]*`d'*`d'*`d' + ///
                _b[c.dose#c.dose#c.dose#c.dose]*`d'*`d'*`d'*`d'
                
                local d = 200
                lincom ///  200 vs 0
                _b[dose]*`d' + ///
                _b[c.dose#c.dose]*`d'*`d' + ///
                _b[c.dose#c.dose#c.dose]*`d'*`d'*`d' + ///
                _b[c.dose#c.dose#c.dose#c.dose]*`d'*`d'*`d'*`d'
                
                * Redo anova with dunnett's to compare the mean differences from lincom
                quietly anova response i.dose // Repeat analysis 1b quietly
                pwcompare dose, pveffects mcompare(dunnett)
                * Notice that the Contrast values in this last table match the
                * Coef. values from the -lincom- commands above.  The p-values
                * from -lincom- do not match those from -pwcompare- because -lincom-
                * is not implementing Dunnett's test.  If I use -margins- with the
                * r. contrast operator, I should get results that match those from
                * -lincom-.
                margins r.dose, contrast(nowald effects)
                HTH.
                --
                Bruce Weaver
                Email: [email protected]
                Version: Stata/MP 19.5 (Windows)

                Comment


                • #9
                  Originally posted by Marcos Almeida View Post
                  Simply put, it shall be taken as a categorical variable. Point. When we use factor notation, we undoubtedly "tell" Stata it is a categorical variable.

                  Please take some time to grasp the core information about categorical variables. It is found in any decent textbook on stats. In short, what categorization really "means" is not at all related to the "mean" value and the likes, for example, or even the concept of "value", so to speak. However basic it might be, it is a crucial step.
                  What I meant was that, since drug dose can be seen as both ordinal and continuous, both i.dose and c.dose could both equally apply. Therefore, conc##conc and c.conc##c.conc are relevant in this case.
                  Last edited by Nigel Moore; 10 Nov 2017, 04:31.
                  Stata 14.2MP
                  OS X

                  Comment


                  • #10
                    Originally posted by Joseph Coveney View Post
                    It does. And it did in my example.
                    That's where I got confused then, because this looked like a linear regression (I read that a quadratic would have dose2):

                    regress response i.dose
                    contrast p.dose // Here's your quadratic
                    I didn't see the quadratic term in the regression, although running -contrast- tested both quadratic and cubic.

                    Originally posted by Joseph Coveney View Post
                    You can think of the model (when using a factor variable for the dose) as a connect-the-dots fitting of the nonlinear dose-response relationship.
                    And that's why! It also tells me how I can run a Dunnett test on non-monotonic data. Thank you!
                    Last edited by Nigel Moore; 10 Nov 2017, 04:50.
                    Stata 14.2MP
                    OS X

                    Comment


                    • #11
                      Originally posted by Bruce Weaver View Post
                      Hi Nigel. If I understand what you are trying to do, you may find it beneficial to study the following example.

                      HTH.
                      Thank you, Bruce, that is also very helpful. If I read that correctly, you are also using -anova-, not just -regress, to analyse a polynomial.
                      Stata 14.2MP
                      OS X

                      Comment


                      • #12
                        Originally posted by Nigel Moore View Post

                        Thank you, Bruce, that is also very helpful. If I read that correctly, you are also using -anova-, not just -regress, to analyse a polynomial.
                        Hi Nigel. The point was that the results from anova are the same as the results from regress with i.dose. Why? Because one-way ANOVA is the same thing as OLS multiple linear regression with one categorical explanatory variable expressed as k-1 indicator variables (which is what i.dose does).

                        HTH.
                        --
                        Bruce Weaver
                        Email: [email protected]
                        Version: Stata/MP 19.5 (Windows)

                        Comment

                        Working...
                        X