multiple pairwise comparisons after quadratic regression

Nigel Moore

Join Date: Apr 2016
Posts: 79

multiple pairwise comparisons after quadratic regression

07 Nov 2017, 06:13

After running a standard linear regression, it is easy enough to run pairwise comparisons as ling as the indepvar is factorial. But I have a non-monotonic response for which a linear regression is inappropriate:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int dose double response float dose2
  0 7.14     0
  0 6.89     0
  0 9.09     0
  0 7.83     0
  0 9.47     0
 10 8.75   100
 10 7.91   100
 10 9.93   100
 10 9.84   100
 10 8.37   100
 50 10.2  2500
 50 10.7  2500
 50 10.5  2500
 50 8.43  2500
 50 10.9  2500
100 11.1 10000
100 10.6 10000
100 10.6 10000
100 10.9 10000
100 10.9 10000
200 10.4 40000
200   11 40000
200   10 40000
200 9.23 40000
200 10.1 40000
end

Here, dose2 is simply dose^2. Dose could be seen as both an ordinal and continuous variable.

Quadratic regression improves the fit over the linear model:

Code:

. regress response dose

      Source |       SS           df       MS      Number of obs   =        25
-------------+----------------------------------   F(1, 23)        =      9.13
       Model |  10.8728249         1  10.8728249   Prob > F        =    0.0061
    Residual |  27.3822391        23  1.19053214   R-squared       =    0.2842
-------------+----------------------------------   Adj R-squared   =    0.2531
       Total |   38.255064        24    1.593961   Root MSE        =    1.0911

------------------------------------------------------------------------------
    response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dose |    .009028   .0029874     3.02   0.006     .0028481    .0152079
       _cons |   8.981181   .3064083    29.31   0.000     8.347328    9.615035
------------------------------------------------------------------------------

. regress response c.dose##c.dose

      Source |       SS           df       MS      Number of obs   =        25
-------------+----------------------------------   F(2, 22)        =     17.14
       Model |  23.2990371         2  11.6495185   Prob > F        =    0.0000
    Residual |  14.9560269        22  .679819406   R-squared       =    0.6090
-------------+----------------------------------   Adj R-squared   =    0.5735
       Total |   38.255064        24    1.593961   Root MSE        =    .82451

-------------------------------------------------------------------------------
     response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
         dose |   .0433276   .0083342     5.20   0.000     .0260436    .0606116
              |
c.dose#c.dose |  -.0001714   .0000401    -4.28   0.000    -.0002546   -.0000883
              |
        _cons |   8.314824   .2791116    29.79   0.000     7.735982    8.893666
-------------------------------------------------------------------------------

This difference is significant:

Code:

. nestreg: regress response dose dose2

Block  1: dose

      Source |       SS           df       MS      Number of obs   =        25
-------------+----------------------------------   F(1, 23)        =      9.13
       Model |  10.8728249         1  10.8728249   Prob > F        =    0.0061
    Residual |  27.3822391        23  1.19053214   R-squared       =    0.2842
-------------+----------------------------------   Adj R-squared   =    0.2531
       Total |   38.255064        24    1.593961   Root MSE        =    1.0911

------------------------------------------------------------------------------
    response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dose |    .009028   .0029874     3.02   0.006     .0028481    .0152079
       _cons |   8.981181   .3064083    29.31   0.000     8.347328    9.615035
------------------------------------------------------------------------------

Block  2: dose2

      Source |       SS           df       MS      Number of obs   =        25
-------------+----------------------------------   F(2, 22)        =     17.14
       Model |  23.2990371         2  11.6495185   Prob > F        =    0.0000
    Residual |  14.9560269        22  .679819406   R-squared       =    0.6090
-------------+----------------------------------   Adj R-squared   =    0.5735
       Total |   38.255064        24    1.593961   Root MSE        =    .82451

------------------------------------------------------------------------------
    response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dose |   .0433276   .0083342     5.20   0.000     .0260436    .0606116
          dose2 |  -.0001714   .0000401    -4.28   0.000    -.0002546   -.0000883
       _cons |   8.314824   .2791116    29.79   0.000     7.735982    8.893666
------------------------------------------------------------------------------


  +-------------------------------------------------------------+
  |       |          Block  Residual                     Change |
  | Block |       F     df        df   Pr > F       R2    in R2 |
  |-------+-----------------------------------------------------|
  |     1 |    9.13      1        23   0.0061   0.2842          |
  |     2 |   18.28      1        22   0.0003   0.6090   0.3248 |
  +-------------------------------------------------------------+

But is it possible to now run a pairwise comparison? I tried running the quadratic with both dose and dose2 as factor variables, but the output was no different to that of linear regression against i.dose due to colinearity. Should I just be satisfied with identifying the trend, without pairwise comparison between groups? (Please note, I'm interested in comparing the means of measured values, not predicted means).

Stata 14.2MP
OS X

Tags: None

Joseph Coveney

Join Date: Apr 2014
Posts: 4410

07 Nov 2017, 07:44

Originally posted by Nigel Moore View Post

Should I just be satisfied with identifying the trend, without pairwise comparison between groups? (Please note, I'm interested in comparing the means of measured values, not predicted means).

No, you can do both: identify a linear and quadratic relationship and do pairwise comparisons of the means of measured values, all in one regression using dose as a factor variable. See below.

Code:

version 15.0

clear *

quietly input int dose double response float dose2
  0 7.14     0
  0 6.89     0
  0 9.09     0
  0 7.83     0
  0 9.47     0
 10 8.75   100
 10 7.91   100
 10 9.93   100
 10 9.84   100
 10 8.37   100
 50 10.2  2500
 50 10.7  2500
 50 10.5  2500
 50 8.43  2500
 50 10.9  2500
100 11.1 10000
100 10.6 10000
100 10.6 10000
100 10.9 10000
100 10.9 10000
200 10.4 40000
200   11 40000
200   10 40000
200 9.23 40000
200 10.1 40000
end

drop dose2

table dose, contents(mean response) // For reference in -pwcompare- below

regress response i.dose
contrast p.dose // Here's your quadratic

//  "(Please note, I'm interested in comparing the means of measured values, not predicted means). "
pwcompare dose, pveffects mcompare(dunnett) // If comparisons to zero-dose control group--these _are_ comparisons of the means
pwcompare dose, pveffects mcompare(bonferroni) // Otherwise--again, because of use of factor variable, these _are also_ comparisons of the means

exit

I don't quite understand the statement, "I tried running the quadratic with both dose and dose2 as factor variables, but the output was no different to that of linear regression against i.dose due to colinearity." Maybe the output was similar because there isn't much beyond the quadratic (see output from -contrast- command above). I don't see where colinearity comes into play here.

Comment

Nigel Moore

Join Date: Apr 2016
Posts: 79

07 Nov 2017, 13:02

Thank you, Joseph, that's really helpful!

The regression is just linear, correct? I had assumed that -contrast- only examined the previously run regression. So what we're doing here is running -pwcompare- after a linear regression. I guess that males sense, the pairwise comparisons are between groups, and are independent of the shape of the regression curve.

Originally posted by Joseph Coveney View Post

This is the linear regression:

Code:

. regress response i.dose

      Source |       SS           df       MS      Number of obs   =        25
-------------+----------------------------------   F(4, 20)        =      8.36
       Model |   23.938104         4    5.984526   Prob > F        =    0.0004
    Residual |    14.31696        20     .715848   R-squared       =    0.6257
-------------+----------------------------------   Adj R-squared   =    0.5509
       Total |   38.255064        24    1.593961   Root MSE        =    .84608

------------------------------------------------------------------------------
    response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dose |
         10  |       .876   .5351067     1.64   0.117    -.2402131    1.992213
         50  |      2.062   .5351067     3.85   0.001     .9457869    3.178213
        100  |      2.736   .5351067     5.11   0.000     1.619787    3.852213
        200  |      2.062   .5351067     3.85   0.001     .9457869    3.178213
             |
       _cons |      8.084   .3783776    21.36   0.000     7.294718    8.873282
------------------------------------------------------------------------------

And the quadratic:

Code:

. regress response i.dose i.dose2
note: 100.dose2 omitted because of collinearity
note: 2500.dose2 omitted because of collinearity
note: 10000.dose2 omitted because of collinearity
note: 40000.dose2 omitted because of collinearity

      Source |       SS           df       MS      Number of obs   =        25
-------------+----------------------------------   F(4, 20)        =      8.36
       Model |   23.938104         4    5.984526   Prob > F        =    0.0004
    Residual |    14.31696        20     .715848   R-squared       =    0.6257
-------------+----------------------------------   Adj R-squared   =    0.5509
       Total |   38.255064        24    1.593961   Root MSE        =    .84608

------------------------------------------------------------------------------
    response |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dose |
         10  |       .876   .5351067     1.64   0.117    -.2402131    1.992213
         50  |      2.062   .5351067     3.85   0.001     .9457869    3.178213
        100  |      2.736   .5351067     5.11   0.000     1.619787    3.852213
        200  |      2.062   .5351067     3.85   0.001     .9457869    3.178213
             |
       dose2 |
        100  |          0  (omitted)
       2500  |          0  (omitted)
      10000  |          0  (omitted)
      40000  |          0  (omitted)
             |
       _cons |      8.084   .3783776    21.36   0.000     7.294718    8.873282
------------------------------------------------------------------------------

They appear identical in all aspects, except that the second has the dose2 items omitted.

Stata 14.2MP
OS X

Comment

Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#4

08 Nov 2017, 01:55

The variable ‘dose’ being categorical, there is no sense in creating a squared term of it.

Best regards,

Marcos
Comment
Nigel Moore

Join Date: Apr 2016

Posts: 79
#5

08 Nov 2017, 04:03

Originally posted by Marcos Almeida View Post

The variable ‘dose’ being categorical, there is no sense in creating a squared term of it.

But it's not simply categorical, I think. It's the amount of drug applied, so in that case it is both ordinal and continuous. Although the dose levels are discrete (there is nothing between 50 and 100, for example), a dose of 200 is double that of 100 and four times that of 50.

Does that make a difference?

Stata 14.2MP
OS X
Comment
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#6

08 Nov 2017, 04:21

Simply put, it shall be taken as a categorical variable. Point. When we use factor notation, we undoubtedly "tell" Stata it is a categorical variable.

Please take some time to grasp the core information about categorical variables. It is found in any decent textbook on stats. In short, what categorization really "means" is not at all related to the "mean" value and the likes, for example, or even the concept of "value", so to speak. However basic it might be, it is a crucial step.

Best regards,

Marcos
Comment
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#7

08 Nov 2017, 04:28

Originally posted by Nigel Moore View Post

The regression is just linear, correct?

regress fits linear models, but the functional relationship between response and dose is nonlinear. You can think of the model (when using a factor variable for the dose) as a connect-the-dots fitting of the nonlinear dose-response relationship.

Originally posted by Nigel Moore View Post

I had assumed that -contrast- only examined the previously run regression.

It does. And it did in my example.

Originally posted by Nigel Moore View Post

This is the linear regression:

Code:

regress response i.dose

And the quadratic:

Code:

regress response i.dose i.dose2

I see now what you were doing. I had thought that you compared

Code:

regress response c.dose##c.dose

to

Code:

regress response i.dose

and found them to be identical.
1 like
Comment

Bruce Weaver

Join Date: May 2014
Posts: 1133

08 Nov 2017, 15:36

Hi Nigel. If I understand what you are trying to do, you may find it beneficial to study the following example.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int dose double response float dose2
  0 7.14     0
  0 6.89     0
  0 9.09     0
  0 7.83     0
  0 9.47     0
 10 8.75   100
 10 7.91   100
 10 9.93   100
 10 9.84   100
 10 8.37   100
 50 10.2  2500
 50 10.7  2500
 50 10.5  2500
 50 8.43  2500
 50 10.9  2500
100 11.1 10000
100 10.6 10000
100 10.6 10000
100 10.9 10000
100 10.9 10000
200 10.4 40000
200   11 40000
200   10 40000
200 9.23 40000
200 10.1 40000
end

* Method 1a:  Dose as categorical with polynomial trend analysis & Dunnett's
regress response i.dose
contrast p.dose, pveffects
pwcompare dose, pveffects mcompare(dunnett)

* Method 1b:  Same as 1a, but using -anova- rather than -regress
anova response i.dose
contrast p.dose, pveffects
pwcompare dose, pveffects mcompare(dunnett)

* Method 2:  Dose as continuous.
* In this case, to get an F-test for the overall model that matches
* those of 1a and 1b, we need to include the linear, quadratic,
* cubic and quartic terms in the model (given that there are 5 doses).
regress response c.dose##c.dose##c.dose##c.dose

* Compare each dose to 0 via -lincom-
local d = 10
lincom ///  10 vs 0
_b[dose]*`d' + ///
_b[c.dose#c.dose]*`d'*`d' + ///
_b[c.dose#c.dose#c.dose]*`d'*`d'*`d' + ///
_b[c.dose#c.dose#c.dose#c.dose]*`d'*`d'*`d'*`d'

local d = 50
lincom ///  50 vs 0
_b[dose]*`d' + ///
_b[c.dose#c.dose]*`d'*`d' + ///
_b[c.dose#c.dose#c.dose]*`d'*`d'*`d' + ///
_b[c.dose#c.dose#c.dose#c.dose]*`d'*`d'*`d'*`d'

local d = 100
lincom ///  100 vs 0
_b[dose]*`d' + ///
_b[c.dose#c.dose]*`d'*`d' + ///
_b[c.dose#c.dose#c.dose]*`d'*`d'*`d' + ///
_b[c.dose#c.dose#c.dose#c.dose]*`d'*`d'*`d'*`d'

local d = 200
lincom ///  200 vs 0
_b[dose]*`d' + ///
_b[c.dose#c.dose]*`d'*`d' + ///
_b[c.dose#c.dose#c.dose]*`d'*`d'*`d' + ///
_b[c.dose#c.dose#c.dose#c.dose]*`d'*`d'*`d'*`d'

* Redo anova with dunnett's to compare the mean differences from lincom
quietly anova response i.dose // Repeat analysis 1b quietly
pwcompare dose, pveffects mcompare(dunnett)
* Notice that the Contrast values in this last table match the
* Coef. values from the -lincom- commands above.  The p-values
* from -lincom- do not match those from -pwcompare- because -lincom-
* is not implementing Dunnett's test.  If I use -margins- with the
* r. contrast operator, I should get results that match those from
* -lincom-.
margins r.dose, contrast(nowald effects)

HTH.

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 19.5 (Windows)

Comment

Nigel Moore

Join Date: Apr 2016

Posts: 79
#9

10 Nov 2017, 04:18

Originally posted by Marcos Almeida View Post

Simply put, it shall be taken as a categorical variable. Point. When we use factor notation, we undoubtedly "tell" Stata it is a categorical variable.

Please take some time to grasp the core information about categorical variables. It is found in any decent textbook on stats. In short, what categorization really "means" is not at all related to the "mean" value and the likes, for example, or even the concept of "value", so to speak. However basic it might be, it is a crucial step.

What I meant was that, since drug dose can be seen as both ordinal and continuous, both i.dose and c.dose could both equally apply. Therefore, conc##conc and c.conc##c.conc are relevant in this case.

Last edited by Nigel Moore; 10 Nov 2017, 04:31.

Stata 14.2MP
OS X
Comment
Nigel Moore

Join Date: Apr 2016

Posts: 79
#10

10 Nov 2017, 04:42

Originally posted by Joseph Coveney View Post

It does. And it did in my example.

That's where I got confused then, because this looked like a linear regression (I read that a quadratic would have dose²):

regress response i.dose
contrast p.dose // Here's your quadratic

I didn't see the quadratic term in the regression, although running -contrast- tested both quadratic and cubic.

Originally posted by Joseph Coveney View Post

You can think of the model (when using a factor variable for the dose) as a connect-the-dots fitting of the nonlinear dose-response relationship.

And that's why! It also tells me how I can run a Dunnett test on non-monotonic data. Thank you!

Last edited by Nigel Moore; 10 Nov 2017, 04:50.

Stata 14.2MP
OS X
Comment
Nigel Moore

Join Date: Apr 2016

Posts: 79
#11

10 Nov 2017, 04:54

Originally posted by Bruce Weaver View Post

Hi Nigel. If I understand what you are trying to do, you may find it beneficial to study the following example.

HTH.

Thank you, Bruce, that is also very helpful. If I read that correctly, you are also using -anova-, not just -regress, to analyse a polynomial.

Stata 14.2MP
OS X
Comment
Bruce Weaver

Join Date: May 2014

Posts: 1133
#12

10 Nov 2017, 08:20

Originally posted by Nigel Moore View Post

Thank you, Bruce, that is also very helpful. If I read that correctly, you are also using -anova-, not just -regress, to analyse a polynomial.

Hi Nigel. The point was that the results from anova are the same as the results from regress with i.dose. Why? Because one-way ANOVA is the same thing as OLS multiple linear regression with one categorical explanatory variable expressed as k-1 indicator variables (which is what i.dose does).

HTH.

--
Bruce Weaver
Email: [email protected]
Version: Stata/MP 19.5 (Windows)
1 like
Comment

Announcement

multiple pairwise comparisons after quadratic regression

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment