Robust: when to use it in a -mixed- regression?

Nigel Moore

Join Date: Apr 2016
Posts: 79


28 Aug 2017, 13:32

The vce(robust) option allows SE to be derived that are robust to misspecification. But does the regression output give clues as to when it should be applied?

Here, I have run the same regression, with and without the option. The margins themselves are unchanged. But there are two changes to the SE: firstly, most of the coefficient SE get larger with the option included (that for 1.time gets smaller); secondly, the margins SE at the two time points within each sb10 group are no longer the same. Intuitively, the latter observation makes sense, since the SE of the measured data are not equal.

Code:

. mixed hr ib179.sb10##time c.ph || id:

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood = -435.33309  
Iteration 1:   log likelihood = -435.33077  
Iteration 2:   log likelihood = -435.33077  

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =        106
Group variable: id                              Number of groups  =         53

                                                Obs per group:
                                                              min =          2
                                                              avg =        2.0
                                                              max =          2

                                                Wald chi2(6)      =      84.12
Log likelihood = -435.33077                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
          hr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sb10 |
          0  |   54.66829   35.83739     1.53   0.127    -15.57171    124.9083
         15  |    37.3623    23.2042     1.61   0.107    -8.117095    82.84169
             |
      1.time |  -4.075213   2.822213    -1.44   0.149    -9.606648    1.456223
             |
   sb10#time |
        0 1  |  -36.44387   8.415938    -4.33   0.000    -52.93881   -19.94894
       15 1  |  -37.57096   7.329471    -5.13   0.000    -51.93646   -23.20546
             |
          ph |   35.28616   20.87117     1.69   0.091    -5.620595    76.19291
       _cons |  -70.74118   152.6358    -0.46   0.643    -369.9019    228.4196
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   80.96055    33.8667      35.66206    183.7979
-----------------------------+------------------------------------------------
               var(Residual) |   149.8417   29.23925      102.2196      219.65
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 6.85          Prob >= chibar2 = 0.0044

. margins time, over(sb10)

Predictive margins                              Number of obs     =        106

Expression   : Linear prediction, fixed portion, predict()
over         : sb10

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   sb10#time |
        0 0  |   186.5453   6.270516    29.75   0.000     174.2553    198.8352
        0 1  |   146.0262   6.270516    23.29   0.000     133.7362    158.3162
       15 0  |   188.8231   5.858661    32.23   0.000     177.3403    200.3059
       15 1  |   147.1769   5.858661    25.12   0.000     135.6941    158.6597
      179 0  |   187.7299   2.447076    76.72   0.000     182.9337    192.5261
      179 1  |   183.6547   2.447076    75.05   0.000     178.8585    188.4509
------------------------------------------------------------------------------

But, with robust analysis I get:

Code:

. mixed hr ib179.sb10##time c.ph || id:, vce(robust)

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log pseudolikelihood = -435.33309  
Iteration 1:   log pseudolikelihood = -435.33077  
Iteration 2:   log pseudolikelihood = -435.33077  

Computing standard errors:

Mixed-effects regression                        Number of obs     =        106
Group variable: id                              Number of groups  =         53

                                                Obs per group:
                                                              min =          2
                                                              avg =        2.0
                                                              max =          2

                                                Wald chi2(6)      =      37.04
Log pseudolikelihood = -435.33077               Prob > chi2       =     0.0000

                                    (Std. Err. adjusted for 53 clusters in id)
------------------------------------------------------------------------------
             |               Robust
          hr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sb10 |
          0  |   54.66829   44.73779     1.22   0.222    -33.01618    142.3527
         15  |    37.3623   27.34913     1.37   0.172    -16.24101     90.9656
             |
      1.time |  -4.075213   1.627908    -2.50   0.012    -7.265855   -.8845707
             |
   sb10#time |
        0 1  |  -36.44387   11.91491    -3.06   0.002    -59.79666   -13.09109
       15 1  |  -37.57096   14.48491    -2.59   0.009    -65.96086   -9.181053
             |
          ph |   35.28616   24.87588     1.42   0.156    -13.46967    84.04198
       _cons |  -70.74118   182.1527    -0.39   0.698    -427.7539    286.2715
------------------------------------------------------------------------------

------------------------------------------------------------------------------
                             |               Robust          
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   80.96055   34.62092      35.01682    187.1846
-----------------------------+------------------------------------------------
               var(Residual) |   149.8417   49.85253      78.06149    287.6263
------------------------------------------------------------------------------

. margins time, over(sb10)

Predictive margins                              Number of obs     =        106
Model VCE    : Robust

Expression   : Linear prediction, fixed portion, predict()
over         : sb10

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   sb10#time |
        0 0  |   186.5453   5.986489    31.16   0.000      174.812    198.2786
        0 1  |   146.0262   8.348447    17.49   0.000     129.6635    162.3888
       15 0  |   188.8231   4.486814    42.08   0.000     180.0291    197.6171
       15 1  |   147.1769   12.58497    11.69   0.000     122.5108     171.843
      179 0  |   187.7299   2.196496    85.47   0.000     183.4249     192.035
      179 1  |   183.6547   2.058484    89.22   0.000     179.6201    187.6893
------------------------------------------------------------------------------

Which is correct, and why?
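For a side-by-side view of the two sets of SE, the two fits can be stored and tabulated together. This is only a minimal sketch, assuming the same data are in memory; the names conv and rob are arbitrary labels chosen here for illustration.

```stata
* Fit without and with vce(robust), storing each set of results
mixed hr ib179.sb10##time c.ph || id:
estimates store conv

mixed hr ib179.sb10##time c.ph || id:, vce(robust)
estimates store rob

* Coefficients are identical; only the SE rows differ
estimates table conv rob, b(%9.3f) se(%9.3f)
```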

Last edited by Nigel Moore; 28 Aug 2017, 13:35.

Stata 14.2MP
OS X


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

29 Aug 2017, 01:14

Nigel:
Robustified standard errors (SEs) do not affect point estimates.
As far as I can see from your results, the changes in SE-dependent measures are quite negligible.
All in all, I would not say that your model is uninformative without robust SEs.

Kind regards,
Carlo
(Stata 19.0)
Nigel Moore

Join Date: Apr 2016

Posts: 79
#3

29 Aug 2017, 02:28

Carlo

Thank you for your reply. Indeed, the point estimates do not change, but the SE do (some increase, some decrease). This latter effect means that some previously not significant differences now become significant. It also changes the look of plots that I have created with CI.
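To see how a change in SE alone moves a test, the z statistic and two-sided p-value can be recomputed by hand from the 1.time row of the two tables in the first post (coefficient -4.075213; SE 2.822213 without, and 1.627908 with, vce(robust)). A minimal sketch using only -display-:

```stata
* Conventional SE: z and two-sided p-value for 1.time
display -4.075213/2.822213
display 2*normal(-abs(-4.075213/2.822213))

* Robust SE: same coefficient, smaller denominator
display -4.075213/1.627908
display 2*normal(-abs(-4.075213/1.627908))
```

These should agree with the z and P>|z| columns reported in those tables (-1.44 and 0.149; -2.50 and 0.012), since the coefficient is identical and only the SE changes.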

It also means that with one limited dataset I have run into the "Warning: variance matrix is nonsymmetric or highly singular" problem. But that is another matter entirely.

As I understand it, robustified SE protect against non-Gaussian distribution. But I thought that -mixed- was more tolerant of that than, say, -anova-, for which we have Bartlett's test and the like. Which leads me to the question as to what clues we have in the regression output to say whether the inherent tolerance of -mixed- is overwhelmed and the output needs to be robustified.

Or have I completely missed the point?

Stata 14.2MP
OS X
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#4

29 Aug 2017, 02:46

Nigel:
I think that two issues are worth considering:
- robustifying or not is, in my opinion, a minor issue: again, no relevant difference in significance strikes me when comparing the two regression outputs (hence, I see no clues that favor clustered standard errors, unless your residuals look dramatically heteroskedastic);
- a small sample size: that's more relevant, as long as you can obtain more data. If obtaining more data is unfeasible, it remains a matter of fact.
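One way to look at the residuals for heteroskedasticity after -mixed- might be sketched as follows. This assumes the first model of the thread has just been refit without vce(robust); res and fit are hypothetical new variable names, and the choice of checks is illustrative rather than a prescribed procedure.

```stata
* Refit the first model without vce(robust)
mixed hr ib179.sb10##time c.ph || id:

* Residuals and fitted values (res and fit are placeholder names)
predict double res, residuals
predict double fit, fitted

* Visual checks: spread against fitted values, and normality
scatter res fit, yline(0)
qnorm res

* Residual spread by time point, then a Levene-type test of equal
* residual variance across the sb10 groups
tabstat res, by(time) statistics(sd n)
robvar res, by(sb10)
```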

Kind regards,
Carlo
(Stata 19.0)

Nigel Moore

Join Date: Apr 2016
Posts: 79

29 Aug 2017, 04:39

Carlo

I hope that I didn't mislead by presenting the first, and possibly least changed, analysis from my model. The case below shows a slightly different context, one in which some statistically significant changes occur. The model constant changes from p=0.122 to p=0.015 and, moreover, the change in HR (my dependent variable) for 179#2#120 changes from p>0.05 to p<0.05:

Code:

. mixed hr conc10##treat##ib333.sb10##time c.ph if gd==11 & conc==12 || id:
note: 120.conc10 omitted because of collinearity
note: 120.conc10#2.treat omitted because of collinearity
note: 120.conc10#131.sb10 omitted because of collinearity
note: 120.conc10#143.sb10 omitted because of collinearity
note: 120.conc10#179.sb10 omitted because of collinearity
note: 1.treat#143.sb10 identifies no observations in the sample
note: 2.treat#131.sb10 identifies no observations in the sample
note: 2.treat#143.sb10 omitted because of collinearity
note: 120.conc10#1.treat#143.sb10 identifies no observations in the sample
note: 120.conc10#2.treat#131.sb10 identifies no observations in the sample
note: 120.conc10#2.treat#143.sb10 omitted because of collinearity
note: 120.conc10#2.treat#179.sb10 omitted because of collinearity
note: 120.conc10#1.time omitted because of collinearity
note: 120.conc10#2.treat#1.time omitted because of collinearity
note: 120.conc10#131.sb10#1.time omitted because of collinearity
note: 120.conc10#143.sb10#1.time omitted because of collinearity
note: 120.conc10#179.sb10#1.time omitted because of collinearity
note: 1.treat#143.sb10#0.time identifies no observations in the sample
note: 1.treat#143.sb10#1.time identifies no observations in the sample
note: 2.treat#131.sb10#0.time identifies no observations in the sample
note: 2.treat#131.sb10#1.time identifies no observations in the sample
note: 2.treat#143.sb10#1.time omitted because of collinearity
note: 120.conc10#1.treat#143.sb10#0.time identifies no observations in the sample
note: 120.conc10#1.treat#143.sb10#1.time identifies no observations in the sample
note: 120.conc10#2.treat#131.sb10#0.time identifies no observations in the sample
note: 120.conc10#2.treat#131.sb10#1.time identifies no observations in the sample
note: 120.conc10#2.treat#143.sb10#1.time omitted because of collinearity
note: 120.conc10#2.treat#179.sb10#1.time omitted because of collinearity

Performing EM optimization: 

Performing gradient-based optimization: 

Iteration 0:   log likelihood = -454.86949  
Iteration 1:   log likelihood = -454.86674  
Iteration 2:   log likelihood = -454.86674  

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =        114
Group variable: id                              Number of groups  =         57

                                                Obs per group:
                                                              min =          2
                                                              avg =        2.0
                                                              max =          2

                                                Wald chi2(12)     =     216.11
Log likelihood = -454.86674                     Prob > chi2       =     0.0000

----------------------------------------------------------------------------------------
                    hr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
            120.conc10 |          0  (omitted)
               2.treat |   5.491956   6.082552     0.90   0.367    -6.429627    17.41354
                       |
          conc10#treat |
                120 2  |          0  (omitted)
                       |
                  sb10 |
                  131  |   3.346835   22.90633     0.15   0.884    -41.54874    48.24241
                  143  |   1.245154   15.04047     0.08   0.934    -28.23363    30.72394
                  179  |   .3537231   10.65648     0.03   0.974     -20.5326    21.24004
                       |
           conc10#sb10 |
              120 131  |          0  (omitted)
              120 143  |          0  (omitted)
              120 179  |          0  (omitted)
                       |
            treat#sb10 |
                1 143  |          0  (empty)
                2 131  |          0  (empty)
                2 143  |          0  (omitted)
                2 179  |  -1.610051   8.534654    -0.19   0.850    -18.33767    15.11756
                       |
     conc10#treat#sb10 |
            120 1 143  |          0  (empty)
            120 2 131  |          0  (empty)
            120 2 143  |          0  (omitted)
            120 2 179  |          0  (omitted)
                       |
                1.time |  -13.54472   4.910127    -2.76   0.006    -23.16839   -3.921049
                       |
           conc10#time |
                120 1  |          0  (omitted)
                       |
            treat#time |
                  2 1  |  -.6442154   7.008604    -0.09   0.927    -14.38083     13.0924
                       |
     conc10#treat#time |
              120 2 1  |          0  (omitted)
                       |
             sb10#time |
                131 1  |  -43.80343   7.936117    -5.52   0.000    -59.35794   -28.24893
                143 1  |  -19.58241   6.923188    -2.83   0.005    -33.15161   -6.013212
                179 1  |   -2.72462   6.942433    -0.39   0.695    -16.33154     10.8823
                       |
      conc10#sb10#time |
            120 131 1  |          0  (omitted)
            120 143 1  |          0  (omitted)
            120 179 1  |          0  (omitted)
                       |
       treat#sb10#time |
              1 143 0  |          0  (empty)
              1 143 1  |          0  (empty)
              2 131 0  |          0  (empty)
              2 131 1  |          0  (empty)
              2 143 1  |          0  (omitted)
              2 179 1  |  -8.588449   9.878976    -0.87   0.385    -27.95089    10.77399
                       |
conc10#treat#sb10#time |
          120 1 143 0  |          0  (empty)
          120 1 143 1  |          0  (empty)
          120 2 131 0  |          0  (empty)
          120 2 131 1  |          0  (empty)
          120 2 143 1  |          0  (omitted)
          120 2 179 1  |          0  (omitted)
                       |
                    ph |  -2.512655   17.70787    -0.14   0.887    -37.21945    32.19414
                 _cons |   203.0313   131.3732     1.55   0.122     -54.4554    460.5181
----------------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   62.30034   25.51654      27.91652    139.0335
-----------------------------+------------------------------------------------
               var(Residual) |   119.7879     22.451      82.96184    172.9608
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 7.08          Prob >= chibar2 = 0.0039

. margins time, over(sb10 treat conc10)

Predictive margins                              Number of obs     =        114

Expression   : Linear prediction, fixed portion, predict()
over         : sb10 treat conc10

----------------------------------------------------------------------------------------
                       |            Delta-method
                       |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
sb10#treat#conc10#time |
          131 1 120 0  |   190.6741   5.260901    36.24   0.000     180.3629    200.9853
          131 1 120 1  |   133.3259   5.260901    25.34   0.000     123.0147    143.6371
          143 2 120 0  |   193.0857   4.342579    44.46   0.000     184.5744     201.597
          143 2 120 1  |   159.3143   4.342579    36.69   0.000      150.803    167.8256
          179 1 120 0  |   185.9347   4.291946    43.32   0.000     177.5226    194.3467
          179 1 120 1  |   169.6653   4.291946    39.53   0.000     161.2533    178.0774
          179 2 120 0  |    189.951   4.281128    44.37   0.000     181.5601    198.3419
          179 2 120 1  |    164.449   4.281128    38.41   0.000     156.0581    172.8399
          333 1 120 0  |   184.3724   4.271624    43.16   0.000     176.0001    192.7446
          333 1 120 1  |   170.8276   4.271624    39.99   0.000     162.4554    179.1999
          333 2 120 0  |   189.8945   4.331508    43.84   0.000     181.4049    198.3841
          333 2 120 1  |   175.7055   4.331508    40.56   0.000     167.2159    184.1951
----------------------------------------------------------------------------------------

With robustification:

Code:

. mixed hr conc10##treat##ib333.sb10##time c.ph if gd==11 & conc==12 || id:, vce(robust)
note: 120.conc10 omitted because of collinearity
note: 120.conc10#2.treat omitted because of collinearity
note: 120.conc10#131.sb10 omitted because of collinearity
note: 120.conc10#143.sb10 omitted because of collinearity
note: 120.conc10#179.sb10 omitted because of collinearity
note: 1.treat#143.sb10 identifies no observations in the sample
note: 2.treat#131.sb10 identifies no observations in the sample
note: 2.treat#143.sb10 omitted because of collinearity
note: 120.conc10#1.treat#143.sb10 identifies no observations in the sample
note: 120.conc10#2.treat#131.sb10 identifies no observations in the sample
note: 120.conc10#2.treat#143.sb10 omitted because of collinearity
note: 120.conc10#2.treat#179.sb10 omitted because of collinearity
note: 120.conc10#1.time omitted because of collinearity
note: 120.conc10#2.treat#1.time omitted because of collinearity
note: 120.conc10#131.sb10#1.time omitted because of collinearity
note: 120.conc10#143.sb10#1.time omitted because of collinearity
note: 120.conc10#179.sb10#1.time omitted because of collinearity
note: 1.treat#143.sb10#0.time identifies no observations in the sample
note: 1.treat#143.sb10#1.time identifies no observations in the sample
note: 2.treat#131.sb10#0.time identifies no observations in the sample
note: 2.treat#131.sb10#1.time identifies no observations in the sample
note: 2.treat#143.sb10#1.time omitted because of collinearity
note: 120.conc10#1.treat#143.sb10#0.time identifies no observations in the sample
note: 120.conc10#1.treat#143.sb10#1.time identifies no observations in the sample
note: 120.conc10#2.treat#131.sb10#0.time identifies no observations in the sample
note: 120.conc10#2.treat#131.sb10#1.time identifies no observations in the sample
note: 120.conc10#2.treat#143.sb10#1.time omitted because of collinearity
note: 120.conc10#2.treat#179.sb10#1.time omitted because of collinearity

Performing EM optimization: 

Performing gradient-based optimization: 

Iteration 0:   log pseudolikelihood = -454.86949  
Iteration 1:   log pseudolikelihood = -454.86674  
Iteration 2:   log pseudolikelihood = -454.86674  

Computing standard errors:

Mixed-effects regression                        Number of obs     =        114
Group variable: id                              Number of groups  =         57

                                                Obs per group:
                                                              min =          2
                                                              avg =        2.0
                                                              max =          2

                                                Wald chi2(12)     =     302.95
Log pseudolikelihood = -454.86674               Prob > chi2       =     0.0000

                                              (Std. Err. adjusted for 57 clusters in id)
----------------------------------------------------------------------------------------
                       |               Robust
                    hr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
            120.conc10 |          0  (omitted)
               2.treat |   5.491956   4.775442     1.15   0.250    -3.867738    14.85165
                       |
          conc10#treat |
                120 2  |          0  (omitted)
                       |
                  sb10 |
                  131  |   3.346835   15.81877     0.21   0.832    -27.65738    34.35105
                  143  |   1.245154   10.07978     0.12   0.902    -18.51086    21.00117
                  179  |   .3537231   7.098563     0.05   0.960    -13.55921    14.26665
                       |
           conc10#sb10 |
              120 131  |          0  (omitted)
              120 143  |          0  (omitted)
              120 179  |          0  (omitted)
                       |
            treat#sb10 |
                1 143  |          0  (empty)
                2 131  |          0  (empty)
                2 143  |          0  (omitted)
                2 179  |  -1.610051   7.118917    -0.23   0.821    -15.56287    12.34277
                       |
     conc10#treat#sb10 |
            120 1 143  |          0  (empty)
            120 2 131  |          0  (empty)
            120 2 143  |          0  (omitted)
            120 2 179  |          0  (omitted)
                       |
                1.time |  -13.54472   2.922665    -4.63   0.000    -19.27304   -7.816404
                       |
           conc10#time |
                120 1  |          0  (omitted)
                       |
            treat#time |
                  2 1  |  -.6442154   4.297932    -0.15   0.881    -9.068008    7.779577
                       |
     conc10#treat#time |
              120 2 1  |          0  (omitted)
                       |
             sb10#time |
                131 1  |  -43.80343   12.33814    -3.55   0.000    -67.98575   -19.62112
                143 1  |  -19.58241   4.301532    -4.55   0.000    -28.01326   -11.15156
                179 1  |   -2.72462   6.381934    -0.43   0.669    -15.23298    9.783741
                       |
      conc10#sb10#time |
            120 131 1  |          0  (omitted)
            120 143 1  |          0  (omitted)
            120 179 1  |          0  (omitted)
                       |
       treat#sb10#time |
              1 143 0  |          0  (empty)
              1 143 1  |          0  (empty)
              2 131 0  |          0  (empty)
              2 131 1  |          0  (empty)
              2 143 1  |          0  (omitted)
              2 179 1  |  -8.588449   7.874498    -1.09   0.275    -24.02218    6.845283
                       |
conc10#treat#sb10#time |
          120 1 143 0  |          0  (empty)
          120 1 143 1  |          0  (empty)
          120 2 131 0  |          0  (empty)
          120 2 131 1  |          0  (empty)
          120 2 143 1  |          0  (omitted)
          120 2 179 1  |          0  (omitted)
                       |
                    ph |  -2.512655   11.25905    -0.22   0.823    -24.57999    19.55468
                 _cons |   203.0313   83.64416     2.43   0.015     39.09179    366.9709
----------------------------------------------------------------------------------------

------------------------------------------------------------------------------
                             |               Robust           
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   62.30034   27.94596      25.86237    150.0764
-----------------------------+------------------------------------------------
               var(Residual) |   119.7879   31.99139      70.97171    202.1812
------------------------------------------------------------------------------

. margins time, over(sb10 treat conc10)

Predictive margins                              Number of obs     =        114
Model VCE    : Robust

Expression   : Linear prediction, fixed portion, predict()
over         : sb10 treat conc10

----------------------------------------------------------------------------------------
                       |            Delta-method
                       |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
sb10#treat#conc10#time |
          131 1 120 0  |   190.6741   4.657938    40.94   0.000     181.5447    199.8035
          131 1 120 1  |   133.3259   8.663067    15.39   0.000     116.3466    150.3052
          143 2 120 0  |   193.0857   2.472733    78.09   0.000     188.2392    197.9321
          143 2 120 1  |   159.3143   4.058671    39.25   0.000     151.3595    167.2692
          179 1 120 0  |   185.9347   2.763752    67.28   0.000     180.5178    191.3515
          179 1 120 1  |   169.6653   5.087959    33.35   0.000     159.6931    179.6375
          179 2 120 0  |    189.951   4.544312    41.80   0.000     181.0443    198.8577
          179 2 120 1  |    164.449   5.130098    32.06   0.000     154.3942    174.5038
          333 1 120 0  |   184.3724   3.102716    59.42   0.000     178.2911    190.4536
          333 1 120 1  |   170.8276   4.237687    40.31   0.000     162.5219    179.1334
          333 2 120 0  |   189.8945   3.613948    52.54   0.000     182.8113    196.9777
          333 2 120 1  |   175.7055   4.878712    36.01   0.000     166.1434    185.2676
----------------------------------------------------------------------------------------

I'm not trying to data-dredge (I have enough significant differences to make a report!) but, since some of my groups are skewed and others are sparse, I would like to know that I'm doing the right thing, one way or another.

But not robustifying would avoid that "variance matrix is nonsymmetric or highly singular" warning!

Stata 14.2MP
OS X

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#6

29 Aug 2017, 06:50

Nigel:
again, I do not see any relevant difference between the two models.
As an aside, with 114 observations and 12 parameters, I wonder whether you're asking too much of your data.

Kind regards,
Carlo
(Stata 19.0)
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#7

29 Aug 2017, 06:58

Just a side note after Carlo's helpful replies. The sample seems quite small to "support" so many interaction terms. There are also issues of collinearity and of missing observations in several groupings. Perhaps these aspects, rather than the use (or not) of robust vce, account for most of the differences found between the two models.

Crossed with Carlo's message, also about the small sample issue.

Best regards,

Marcos

Nigel Moore

Join Date: Apr 2016
Posts: 79

29 Aug 2017, 07:43

Thank you both, gentlemen. The reason that there are several missing observations is that we're comparing the effect of two drugs on embryonic heart rate at different pH levels. The pH levels were established by the concentration of sodium bicarbonate (sb), but the same concentrations were not used in both cases: drug 1 was tested with concentrations of 13.1mM, 17.9mM, and 33.3mM, while drug 2 was tested with 14.3mM, 17.9mM, and 33.3mM.

So, when Stata tries to evaluate drug 1 at 14.3mM sb and drug 2 at 13.1mM sb, it's going to find a lack of data. I'm not sure that there's anything I can do about that.
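The empty cells can be seen directly with a cross-tabulation, and one possible way around them is to index only the drug-by-concentration combinations that were actually observed. The sketch below is an assumption made for illustration, not the model used in this thread; cell is a hypothetical variable name.

```stata
* Drug by bicarbonate band in the analysis sample; zero cells
* correspond to the "(empty)" rows in the regression output
tabulate sb10 treat if gd==11 & conc==12

* Hypothetical alternative: one factor for the observed cells only,
* so no empty treat#sb10 combinations enter the model
egen cell = group(treat sb10) if gd==11 & conc==12, label
mixed hr i.cell##i.time c.ph if gd==11 & conc==12 || id:
```

In the rows listed below, conc10 equals 120 throughout, which would be consistent with every conc10 term being reported as omitted because of collinearity.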

As to collinearity, that may be related to the fact that, while pH bands are set with sodium bicarbonate, there is a small fluctuation of pH within samples. Since this could affect the outcome, I have included measured pH as a covariate (c.ph). The -margins- command uses sb10 (ten times sb, to give a factor variable) to tell Stata when samples are banded together in one pH group. If that makes sense.

My data (for the above -margins- run) are here, if you're interested:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int id byte(time gd treat) double(vol conc sb) int hr double ph float(sb10 conc10)
8966 0 11 1   4 12 17.9 188    7 179 120
8966 1 11 1   4 12 17.9 180 7.13 179 120
8970 0 11 1   4 12 17.9 184 7.01 179 120
8970 1 11 1   4 12 17.9 200 7.09 179 120
8975 0 11 1   4 12 17.9 184 7.01 179 120
8975 1 11 1   4 12 17.9 160 7.06 179 120
9223 0 11 1 2.5 12 33.3 180 7.38 333 120
9223 1 11 1 2.5 12 33.3 156 7.44 333 120
9226 0 11 1 2.5 12 33.3 164  7.4 333 120
9226 1 11 1 2.5 12 33.3 152 7.43 333 120
9229 0 11 1 2.5 12 33.3 188 7.42 333 120
9229 1 11 1 2.5 12 33.3 156 7.44 333 120
9232 0 11 1 2.5 12 33.3 196 7.42 333 120
9232 1 11 1 2.5 12 33.3 192 7.45 333 120
9235 0 11 1 2.5 12 33.3 192 7.43 333 120
9235 1 11 1 2.5 12 33.3 168 7.46 333 120
9238 0 11 1 2.5 12 33.3 192 7.42 333 120
9238 1 11 1 2.5 12 33.3 188 7.43 333 120
9241 0 11 1 2.5 12 33.3 172  7.4 333 120
9241 1 11 1 2.5 12 33.3 164 7.43 333 120
9244 0 11 1 2.5 12 33.3 180 7.44 333 120
9244 1 11 1 2.5 12 33.3 172 7.44 333 120
9247 0 11 1 2.5 12 33.3 188 7.41 333 120
9247 1 11 1 2.5 12 33.3 180 7.43 333 120
9250 0 11 1 2.5 12 33.3 192 7.43 333 120
9250 1 11 1 2.5 12 33.3 180 7.42 333 120
9256 0 11 2 2.5 12 33.3 208 7.16 333 120
9256 1 11 2 2.5 12 33.3 204 7.47 333 120
9257 0 11 2 2.5 12 33.3 196 7.38 333 120
9257 1 11 2 2.5 12 33.3 172 7.45 333 120
9258 0 11 2 2.5 12 33.3 192 7.35 333 120
9258 1 11 2 2.5 12 33.3 164 7.47 333 120
9260 0 11 2 2.5 12 33.3 176 7.41 333 120
9260 1 11 2 2.5 12 33.3 160 7.49 333 120
9263 0 11 2 2.5 12 14.3 188 6.83 143 120
9263 1 11 2 2.5 12 14.3 164  6.9 143 120
9264 0 11 2 2.5 12 14.3 192  6.7 143 120
9264 1 11 2 2.5 12 14.3 172 6.76 143 120
9265 0 11 2 2.5 12 33.3 176  7.4 333 120
9265 1 11 2 2.5 12 33.3 176 7.44 333 120
9266 0 11 2 2.5 12 14.3 196 6.24 143 120
9266 1 11 2 2.5 12 14.3 172 6.41 143 120
9268 0 11 2 2.5 12 14.3 192  6.6 143 120
9268 1 11 2 2.5 12 14.3 136 6.69 143 120
9269 0 11 2 2.5 12 33.3 188 7.39 333 120
9269 1 11 2 2.5 12 33.3 180 7.42 333 120
9270 0 11 2 2.5 12 14.3 204  6.7 143 120
9270 1 11 2 2.5 12 14.3 168 6.77 143 120
9271 0 11 2 2.5 12 14.3 204 6.63 143 120
9271 1 11 2 2.5 12 14.3 172 6.71 143 120
9273 0 11 2 2.5 12 33.3 204 7.41 333 120
9273 1 11 2 2.5 12 33.3 188 7.44 333 120
9274 0 11 2 2.5 12 33.3 176 7.42 333 120
9274 1 11 2 2.5 12 33.3 148 7.45 333 120
9374 0 11 1 2.5 12 17.9 168 6.97 179 120
9374 1 11 1 2.5 12 17.9 152 6.92 179 120
9379 0 11 1 2.5 12 17.9 192 6.92 179 120
9379 1 11 1 2.5 12 17.9 188 6.97 179 120
9382 0 11 1 2.5 12 17.9 176  6.8 179 120
9382 1 11 1 2.5 12 17.9 184 6.83 179 120
9384 0 11 1 2.5 12 17.9 192 6.87 179 120
9384 1 11 1 2.5 12 17.9 156 6.91 179 120
9385 0 11 1 2.5 12 17.9 200 6.85 179 120
9385 1 11 1 2.5 12 17.9 160 6.91 179 120
9386 0 11 1 2.5 12 17.9 192 6.87 179 120
9386 1 11 1 2.5 12 17.9 164 6.93 179 120
9387 0 11 2 2.5 12 33.3 200 7.39 333 120
9387 1 11 2 2.5 12 33.3 188 7.44 333 120
9388 0 11 2 2.5 12 33.3 184 7.41 333 120
9388 1 11 2 2.5 12 33.3 176 7.49 333 120
9389 0 11 2 2.5 12 14.3 180 6.64 143 120
9389 1 11 2 2.5 12 14.3 144 6.75 143 120
9390 0 11 2 2.5 12 14.3 184 6.55 143 120
9390 1 11 2 2.5 12 14.3 144 6.63 143 120
9391 0 11 2 2.5 12 14.3 196 6.53 143 120
9391 1 11 2 2.5 12 14.3 160 6.62 143 120
9392 0 11 2 2.5 12 14.3 196 6.52 143 120
9392 1 11 2 2.5 12 14.3 160 6.61 143 120
9395 0 11 2 2.5 12 17.9 188  6.8 179 120
9395 1 11 2 2.5 12 17.9 168 6.87 179 120
9398 0 11 2 2.5 12 17.9 160 6.84 179 120
9398 1 11 2 2.5 12 17.9 132  6.9 179 120
9400 0 11 1 2.5 12 17.9 184 6.89 179 120
9400 1 11 1 2.5 12 17.9 152 6.96 179 120
9401 0 11 2 2.5 12 17.9 192 6.88 179 120
9401 1 11 2 2.5 12 17.9 164 6.92 179 120
9402 0 11 2 2.5 12 17.9 184 6.89 179 120
9402 1 11 2 2.5 12 17.9 140 6.94 179 120
9405 0 11 2 2.5 12 17.9 172 6.88 179 120
9405 1 11 2 2.5 12 17.9 168 6.91 179 120
9407 0 11 2 2.5 12 17.9 204 6.88 179 120
9407 1 11 2 2.5 12 17.9 164 6.94 179 120
9410 0 11 2 2.5 12 17.9 192 6.83 179 120
9410 1 11 2 2.5 12 17.9 164 6.93 179 120
9413 0 11 2 2.5 12 17.9 204 6.93 179 120
9413 1 11 2 2.5 12 17.9 176 6.93 179 120
9416 0 11 2 2.5 12 17.9 196  6.9 179 120
9416 1 11 2 2.5 12 17.9 184 6.88 179 120
9417 0 11 2 2.5 12 17.9 208 6.89 179 120
9417 1 11 2 2.5 12 17.9 184 6.89 179 120
9503 0 11 1 2.5 12 13.1 172  6.3 131 120
9503 1 11 1 2.5 12 13.1 172 6.38 131 120
9511 0 11 1 2.5 12 13.1 192 6.14 131 120
9511 1 11 1 2.5 12 13.1 104 6.31 131 120
9513 0 11 1 2.5 12 13.1 184 6.13 131 120
9513 1 11 1 2.5 12 13.1 156  6.3 131 120
9517 0 11 1 2.5 12 13.1 188 6.17 131 120
9517 1 11 1 2.5 12 13.1 136 6.38 131 120
9519 0 11 1 2.5 12 13.1 188 6.15 131 120
9519 1 11 1 2.5 12 13.1 112 6.34 131 120
9525 0 11 1 2.5 12 13.1 212 6.23 131 120
9525 1 11 1 2.5 12 13.1 128 6.47 131 120
9530 0 11 1 2.5 12 13.1 200 6.12 131 120
9530 1 11 1 2.5 12 13.1 124 6.08 131 120
end
label values gd GD
label def GD 11 "GD 11", modify

Stata 14.2MP
OS X


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#9

29 Aug 2017, 07:52

Nigel:
your appreciated clarification sheds light on why the number of observations is limited, but, I suspect, won't shelter your analysis from concerns about the consequences of the imbalance between the number of predictors and the scant sample size.

Kind regards,
Carlo
(Stata 19.0)
Nigel Moore

Join Date: Apr 2016

Posts: 79
#10

29 Aug 2017, 08:46

Carlo

I think you might misunderstand what I'm trying to achieve or, more likely, I misunderstand the problem (I'm not a statistician!)

The full study involves the exposure of embryos at two ages (day 11 and day 13) to either no drug, drug 1, or drug 2; at different drug concentrations; and different sb concentrations. The aim is to evaluate the interaction of drug and pH upon embryo heart rate.

We have measured heart rate and pH before and after exposure for one hour.

I have used a -mixed- model so that we can evaluate the data for repeated measures, for unbalanced data sets (some embryos die), and with measured sample pH as a covariate. The predicted -margins- for time correlate very well with the calculated group means; in fact, if c.ph is taken out of the model, the predicted margins are largely identical to the descriptive statistics (explained further here). So, although I'm using a large number of predictors, they seem to work. And, as you have already observed, the margins are unaffected whether or not I robustify the model output.

The only problem that I have is with the SE. Intuitively, robustification seems to give me 'better data' because without it the SE at time 0 and time 1 are the same. Using robust gives me SE that differ, which is the same as the descriptive statistics.

But I'm not trying to extrapolate the model beyond the data that I have already.
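For reference, a minimal sketch of the with/without comparison described above. It assumes the variable names from the -dataex- listing (hr, sb10, time, ph, id); the stored-estimate names m_plain and m_robust are made up for illustration.

Code:

* conventional (model-based) standard errors
mixed hr ib333.sb10##time c.ph || id:
estimates store m_plain
margins sb10#time

* same model with the robust (sandwich) variance estimator
mixed hr ib333.sb10##time c.ph || id:, vce(robust)
estimates store m_robust
margins sb10#time

* coefficients are identical across the two fits; only the SE differ
estimates table m_plain m_robust, b(%9.3f) se(%9.3f)

Because vce(robust) changes only the variance estimator, the point estimates and the margins themselves should match between the two fits; the SE columns are where any difference shows up.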

Stata 14.2MP
OS X
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#11

29 Aug 2017, 09:14

Nigel:
I suspect that we're looking at two sides of the same coin.
I'm perfectly fine with your explanations about research strategy and robust standard errors.
My thought was about the rule of thumb according to which 20 observations per predictor are advised for multiple linear regression (Katz MH. Multivariable Analysis. Second Edition. NY: Cambridge University Press, 2006: 81), even though 10 observations per predictor may sound reasonable enough.
I would apply this empirical requirement to -mixed-, too.
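
As a rough check of this rule of thumb, the ratio can be read off the stored results after fitting the model. This is only a sketch: it uses e(df_m), the degrees of freedom of the Wald test, as a stand-in for the number of predictors, which is an assumption about how the rule should be counted.

Code:

* run after the -mixed- model of interest has been fitted
display "observations per fixed-effects coefficient: " e(N)/e(df_m)

* with repeated measures, the number of subjects may be the more relevant count
quietly tabulate id if e(sample)
display "subjects per fixed-effects coefficient: " r(r)/e(df_m)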

Last edited by Carlo Lazzaro; 29 Aug 2017, 09:48.

Kind regards,
Carlo
(Stata 19.0)
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#12

29 Aug 2017, 12:22

Generally speaking, I fear I still cannot understand the reason for so many combinations of interaction terms. They can become arcane. I suspect the audience (and the person in charge of the analysis) will have a tough time interpreting the findings correctly.

In particular, with such a small sample size, plenty of interaction terms will bite hard.

What is more, given the potential lack of power, it would be unsurprising for these interaction terms to be nonsignificant.

To end, yes, I fully agree with Carlo.

Best regards,

Marcos

Nigel Moore

Join Date: Apr 2016
Posts: 79

#13

29 Aug 2017, 13:24

Actually, Carlo's comment made me reflect on that model. In this instance, the conc10 interaction makes no sense, since conc was limited to 12 (conc10 is simply ten times the concentration, to give an integer that could be used as a factor variable). Removing conc10 from the model made no difference to the outcome, but substantially reduced the number of collinearity and "no observations" notes:

Code:

. mixed hr treat##ib333.sb10##time c.ph if gd==11 & conc==12 || id:
note: 1.treat#143.sb10 identifies no observations in the sample
note: 2.treat#131.sb10 identifies no observations in the sample
note: 2.treat#143.sb10 omitted because of collinearity
note: 1.treat#143.sb10#0.time identifies no observations in the sample
note: 1.treat#143.sb10#1.time identifies no observations in the sample
note: 2.treat#131.sb10#0.time identifies no observations in the sample
note: 2.treat#131.sb10#1.time identifies no observations in the sample
note: 2.treat#143.sb10#1.time omitted because of collinearity

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood = -454.86949  
Iteration 1:   log likelihood = -454.86674  
Iteration 2:   log likelihood = -454.86674  

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =        114
Group variable: id                              Number of groups  =         57

                                                Obs per group:
                                                              min =          2
                                                              avg =        2.0
                                                              max =          2

                                                Wald chi2(12)     =     216.11
Log likelihood = -454.86674                     Prob > chi2       =     0.0000

---------------------------------------------------------------------------------
             hr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        2.treat |   5.491956   6.082552     0.90   0.367    -6.429627    17.41354
                |
           sb10 |
           131  |   3.346835   22.90633     0.15   0.884    -41.54874    48.24241
           143  |   1.245154   15.04047     0.08   0.934    -28.23363    30.72394
           179  |   .3537231   10.65648     0.03   0.974     -20.5326    21.24004
                |
     treat#sb10 |
         1 143  |          0  (empty)
         2 131  |          0  (empty)
         2 143  |          0  (omitted)
         2 179  |  -1.610051   8.534654    -0.19   0.850    -18.33767    15.11756
                |
         1.time |  -13.54472   4.910127    -2.76   0.006    -23.16839   -3.921049
                |
     treat#time |
           2 1  |  -.6442154   7.008604    -0.09   0.927    -14.38083     13.0924
                |
      sb10#time |
         131 1  |  -43.80343   7.936117    -5.52   0.000    -59.35794   -28.24893
         143 1  |  -19.58241   6.923188    -2.83   0.005    -33.15161   -6.013212
         179 1  |   -2.72462   6.942433    -0.39   0.695    -16.33154     10.8823
                |
treat#sb10#time |
       1 143 0  |          0  (empty)
       1 143 1  |          0  (empty)
       2 131 0  |          0  (empty)
       2 131 1  |          0  (empty)
       2 143 1  |          0  (omitted)
       2 179 1  |  -8.588449   9.878976    -0.87   0.385    -27.95089    10.77399
                |
             ph |  -2.512655   17.70787    -0.14   0.887    -37.21945    32.19414
          _cons |   203.0313   131.3732     1.55   0.122     -54.4554    460.5181
---------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   62.30034   25.51654      27.91652    139.0335
-----------------------------+------------------------------------------------
               var(Residual) |   119.7879     22.451      82.96184    172.9608
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 7.08          Prob >= chibar2 = 0.0039

. mixed hr treat##ib333.sb10##time c.ph if gd==11 & conc==12 || id:, vce(robust)
note: 1.treat#143.sb10 identifies no observations in the sample
note: 2.treat#131.sb10 identifies no observations in the sample
note: 2.treat#143.sb10 omitted because of collinearity
note: 1.treat#143.sb10#0.time identifies no observations in the sample
note: 1.treat#143.sb10#1.time identifies no observations in the sample
note: 2.treat#131.sb10#0.time identifies no observations in the sample
note: 2.treat#131.sb10#1.time identifies no observations in the sample
note: 2.treat#143.sb10#1.time omitted because of collinearity

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log pseudolikelihood = -454.86949  
Iteration 1:   log pseudolikelihood = -454.86674  
Iteration 2:   log pseudolikelihood = -454.86674  

Computing standard errors:

Mixed-effects regression                        Number of obs     =        114
Group variable: id                              Number of groups  =         57

                                                Obs per group:
                                                              min =          2
                                                              avg =        2.0
                                                              max =          2

                                                Wald chi2(12)     =     302.95
Log pseudolikelihood = -454.86674               Prob > chi2       =     0.0000

                                       (Std. Err. adjusted for 57 clusters in id)
---------------------------------------------------------------------------------
                |               Robust
             hr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
        2.treat |   5.491956   4.775442     1.15   0.250    -3.867738    14.85165
                |
           sb10 |
           131  |   3.346835   15.81877     0.21   0.832    -27.65738    34.35105
           143  |   1.245154   10.07978     0.12   0.902    -18.51086    21.00117
           179  |   .3537231   7.098563     0.05   0.960    -13.55921    14.26665
                |
     treat#sb10 |
         1 143  |          0  (empty)
         2 131  |          0  (empty)
         2 143  |          0  (omitted)
         2 179  |  -1.610051   7.118917    -0.23   0.821    -15.56287    12.34277
                |
         1.time |  -13.54472   2.922665    -4.63   0.000    -19.27304   -7.816404
                |
     treat#time |
           2 1  |  -.6442154   4.297932    -0.15   0.881    -9.068008    7.779577
                |
      sb10#time |
         131 1  |  -43.80343   12.33814    -3.55   0.000    -67.98575   -19.62112
         143 1  |  -19.58241   4.301532    -4.55   0.000    -28.01326   -11.15156
         179 1  |   -2.72462   6.381934    -0.43   0.669    -15.23298    9.783741
                |
treat#sb10#time |
       1 143 0  |          0  (empty)
       1 143 1  |          0  (empty)
       2 131 0  |          0  (empty)
       2 131 1  |          0  (empty)
       2 143 1  |          0  (omitted)
       2 179 1  |  -8.588449   7.874498    -1.09   0.275    -24.02218    6.845283
                |
             ph |  -2.512655   11.25905    -0.22   0.823    -24.57999    19.55468
          _cons |   203.0313   83.64416     2.43   0.015     39.09179    366.9709
---------------------------------------------------------------------------------

------------------------------------------------------------------------------
                             |               Robust          
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   62.30034   27.94596      25.86237    150.0764
-----------------------------+------------------------------------------------
               var(Residual) |   119.7879   31.99139      70.97171    202.1812
------------------------------------------------------------------------------

.

The remaining "no observations" notes reflect the fact that, as stated above, drug 1 was not tested at 14.3mM sb and drug 2 was not tested at 13.1mM sb.
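
A quick way to see which design cells are empty, sketched with the variable names from the -dataex- listing and the same if-condition as the models above:

Code:

* cross-tabulate drug by sb band; empty cells correspond to the "no observations" notes
tabulate treat sb10 if gd==11 & conc==12

* same table counting each embryo once (baseline rows only)
tabulate treat sb10 if gd==11 & conc==12 & time==0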

Marcos wrote: "I fear I still cannot understand the reason for so many combinations of interaction terms"

We know that pH affects the partitioning of the drugs into the embryo, therefore there is a conc×pH interaction. We also know that sequestration increases over time, so there is a conc×pH×time interaction. We need to separate the two drugs for analysis, but believe that their disposition and effects are largely the same, hence the treat interaction.

I do realise that there are a lot of interactions, but it's a complex biology. I did also run a model without interactions, and the margins bore little resemblance to the measured data, while the interaction model does. I also realise that the observations are limited, but we have ethical considerations in terms of animal numbers.

All I was trying to get at was when we should use robust analysis. In this case, it offers both advantages and disadvantages. That's not the point though. I know that some of the data are very much not Gaussian, and was trying to understand if/when robustification should be introduced on a generic basis:

[Image: example.png]

Last edited by Nigel Moore; 29 Aug 2017, 13:42.

Stata 14.2MP
OS X


Nigel Moore

Join Date: Apr 2016

Posts: 79
#14

31 Aug 2017, 02:37

I started this with the question of when the vce(robust) option should be used. To arrive at an answer, I regressed the predicted SE (without the c.ph covariate, with and without the vce option) against the SE derived from -mean-.

Without vce(robust): coeff=0.3243325, cons=3.081366, R²=0.2232
With vce(robust): coeff=0.8276718, cons=0.5294467, R²=0.8230

In this case at least, the option improves the output, and gives me predicted SE that are closer to the measured values.

I can only conclude that, in the absence of any useful output from the -mixed- regression, the answer to the original question is 'suck it and see'.
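
For anyone wanting to set up a similar comparison, here is one possible sketch. It is not necessarily how the figures above were produced: the choice of cells (sb10 by time), the matrix names, and the use of Mata to take square roots of the variance diagonals are all assumptions.

Code:

* SE of the raw cell means, one cell per sb10-by-time combination
mean hr, over(sb10 time)
mata: st_matrix("se_mean", sqrt(diagonal(st_matrix("e(V)")))')

* SE of the predicted margins from the model without c.ph
mixed hr ib333.sb10##time || id:, vce(robust)
margins sb10#time
mata: st_matrix("se_margin", sqrt(diagonal(st_matrix("r(V)")))')

* stack the two rows; columns run over sb10 levels, with time nested within each
matrix C = se_mean \ se_margin
matrix rownames C = mean margins
matrix list C

Dropping vce(robust) from the -mixed- line gives the model-based SE for the same cells, so the two versions can be compared against the -mean- row in turn.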

Last edited by Nigel Moore; 31 Aug 2017, 02:41.

Stata 14.2MP
OS X
