Comparing regression estimates across different outcome models and testing whether there is heterogeneity

Jenny Williams

Join Date: Sep 2017
Posts: 35

Comparing regression estimates across different outcome models and testing whether there is heterogeneity

13 Apr 2018, 19:26

Hi Statalisters,

Suppose I am interested in examining two outcomes within my data set. And suppose these two outcomes represent related concepts. If I run two separate outcome models for a given set of explanatory variables, I am wondering if there is a way to compare whether estimates are consistent across models?

For example, say I am interested in looking at length and weight as my two outcomes. I could run the models with the same set of explanatory variables and compare whether the estimates are in the same direction or magnitude.

Code:

sysuse auto, clear

/* dichotomize at median to generate binary outcome 1 */
generate largeyn=1 if length>=192.5 & length!=.
replace largeyn=0 if length<192.5
tab largeyn, missing

/* dichotomize at median to generate binary outcome 2 */
generate heavyyn=1 if weight>=3190 & weight!=.
replace heavyyn=0 if weight<3190
tab heavyyn, missing

/* manually compare estimates from each model */
logistic largeyn trunk turn headroom
logistic heavyyn trunk turn headroom


. logistic largeyn trunk turn headroom

Logistic regression                               Number of obs   =         74
                                                  LR chi2(3)      =      76.12
                                                  Prob > chi2     =     0.0000
Log likelihood = -13.231694                       Pseudo R2       =     0.7420

------------------------------------------------------------------------------
     largeyn | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       trunk |   1.825171   .4418256     2.49   0.013     1.135665    2.933303
        turn |   1.956289   .3878218     3.38   0.001      1.32644    2.885215
    headroom |   1.163899   .9530395     0.19   0.853     .2338425    5.793053
       _cons |   6.51e-16   5.88e-15    -3.87   0.000     1.31e-23    3.23e-08
------------------------------------------------------------------------------

. logistic heavyyn trunk turn headroom

Logistic regression                               Number of obs   =         74
                                                  LR chi2(3)      =      71.06
                                                  Prob > chi2     =     0.0000
Log likelihood = -15.762918                       Pseudo R2       =     0.6927

------------------------------------------------------------------------------
     heavyyn | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       trunk |   1.286904   .2192673     1.48   0.139     .9215427    1.797119
        turn |   2.036741   .3769406     3.84   0.000      1.41711    2.927306
    headroom |   1.690922   1.260841     0.70   0.481     .3921233    7.291629
       _cons |   4.56e-15   3.56e-14    -4.24   0.000     1.07e-21    1.95e-08
------------------------------------------------------------------------------

/*
COMPARE ESTIMATES ACROSS MODELS:
for trunk: 1.83 versus 1.29
for turn: 1.96 versus 2.04
for headroom: 1.16 versus 1.69
*/

However, this would be very subjective. Is there a way to generate a statistical test of heterogeneity similar to what a Q-statistic would give in a meta-analysis?

Would it be possible to incorporate an interaction term in the models? For example, if we considered the two outcome models as being "stratified" models from some pseudo population that contains duplicate observations for each individual. Ignoring the fact that the observations are correlated, would something similar to the below code work?

Code:

sysuse auto, clear

/* dichotomize at median to generate binary outcome 1 */
generate largeyn=1 if length>=192.5 & length!=.
replace largeyn=0 if length<192.5
tab largeyn, missing

/* dichotomize at median to generate binary outcome 2 */
generate heavyyn=1 if weight>=3190 & weight!=.
replace heavyyn=0 if weight<3190
tab heavyyn, missing

/* manually compare estimates from each model */
logistic largeyn trunk turn headroom
logistic heavyyn trunk turn headroom


/* create pseudo dataset */

preserve
generate model=0
generate outcome=largeyn
count
save pseudo1.dta, replace
restore

preserve
generate model=1
generate outcome=heavyyn
count
save pseudo2.dta, replace
restore

/* append the two datasets */

clear
use pseudo1.dta, clear
append using pseudo2.dta
count

/* run stratified models using psuedo data, should be same as above */
logistic outcome trunk turn headroom if model==0
logistic outcome trunk turn headroom if model==1

/* instead of running stratified models, compare estimates using statistical interaction test */
logistic outcome c.trunk##i.model c.turn##i.model c.headroom##i.model

. logistic outcome c.trunk##i.model c.turn##i.model c.headroom##i.model

Logistic regression                               Number of obs   =        148
                                                  LR chi2(7)      =     147.18
                                                  Prob > chi2     =     0.0000
Log likelihood = -28.994612                       Pseudo R2       =     0.7174

----------------------------------------------------------------------------------
         outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
           trunk |   1.825171   .4418256     2.49   0.013     1.135665    2.933303
         1.model |   7.014033   83.71004     0.16   0.870     4.87e-10    1.01e+11
                 |
   model#c.trunk |
              1  |   .7050868   .2087225    -1.18   0.238     .3946996    1.259559
                 |
            turn |   1.956289   .3878218     3.38   0.001      1.32644    2.885215
                 |
    model#c.turn |
              1  |   1.041125   .2823573     0.15   0.882     .6118622    1.771545
                 |
        headroom |   1.163899   .9530395     0.19   0.853     .2338425    5.793053
                 |
model#c.headroom |
              1  |   1.452808   1.608938     0.34   0.736     .1657789    12.73171
                 |
           _cons |   6.51e-16   5.88e-15    -3.87   0.000     1.31e-23    3.23e-08
----------------------------------------------------------------------------------

/* display stratum-specific estimates for largeyn outcome*/
lincom _b[trunk]
lincom _b[turn]
lincom _b[headroom]

. lincom _b[trunk]

 ( 1)  [outcome]trunk = 0

------------------------------------------------------------------------------
     outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.825171   .4418256     2.49   0.013     1.135665    2.933303
------------------------------------------------------------------------------

. lincom _b[turn]

 ( 1)  [outcome]turn = 0

------------------------------------------------------------------------------
     outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.956289   .3878218     3.38   0.001      1.32644    2.885215
------------------------------------------------------------------------------

. lincom _b[headroom]

 ( 1)  [outcome]headroom = 0

------------------------------------------------------------------------------
     outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.163899   .9530395     0.19   0.853     .2338425    5.793053
------------------------------------------------------------------------------

/* display stratum-specific estimates for heavyyn outcome */
lincom _b[trunk]+_b[1.model#c.trunk]
lincom _b[turn]+_b[1.model#c.turn]
lincom _b[headroom]+_b[1.model#c.headroom]

. lincom _b[trunk]+_b[1.model#c.trunk]

 ( 1)  [outcome]trunk + [outcome]1.model#c.trunk = 0

------------------------------------------------------------------------------
     outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.286904   .2192673     1.48   0.139     .9215427    1.797119
------------------------------------------------------------------------------

. lincom _b[turn]+_b[1.model#c.turn]

 ( 1)  [outcome]turn + [outcome]1.model#c.turn = 0

------------------------------------------------------------------------------
     outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   2.036741   .3769406     3.84   0.000      1.41711    2.927306
------------------------------------------------------------------------------

. lincom _b[headroom]+_b[1.model#c.headroom]

 ( 1)  [outcome]headroom + [outcome]1.model#c.headroom = 0

------------------------------------------------------------------------------
     outcome | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.690922   1.260841     0.70   0.481     .3921233    7.291629
------------------------------------------------------------------------------

/*

now, comparing the stratum-specific estimates calculated using the interaction term (same values as the stratified models):
for trunk: 1.83 versus 1.29
for turn: 1.96 versus 2.04
for headroom: 1.16 versus 1.69

but now we have a statistical test of heterogeneity (the p-value associated with the interaction term):
trunk estimates not significantly different across outcome models (interaction p-value 0.238)
turn estimates not significantly different across outcome models (interaction p-value 0.882)
headroom estimates not significantly different across outcome models (interaction p-value 0.736)

*/

/* remove data */
rm pseudo1.dta
rm pseudo2.dta

Last edited by Jenny Williams; 13 Apr 2018, 19:38.

Tags: None

Joseph Coveney

Join Date: Apr 2014

Posts: 4540
#2

13 Apr 2018, 21:47

Originally posted by Jenny Williams View Post

Suppose I am interested in examining two outcomes within my data set. And suppose these two outcomes represent related concepts. If I run two separate outcome models for a given set of explanatory variables, I am wondering if there is a way to compare whether estimates are consistent across models?

If I follow you correctly, then you can get what you're after using gsem. See below (start at the Begin here comment.)

.ÿversionÿ15.1

.ÿ
.ÿclearÿ*

.ÿ
.ÿquietlyÿsysuseÿauto

.ÿ
.ÿquietlyÿcentileÿlength

.ÿgenerateÿbyteÿlargeÿ=ÿlengthÿ>ÿr(c_1)

.ÿ
.ÿquietlyÿcentileÿweight

.ÿgenerateÿbyteÿheavyÿ=ÿweightÿ>ÿr(c_1)

.ÿ
.ÿ*
.ÿ*ÿBeginÿhere
.ÿ*
.ÿ
.ÿ//ÿWaldÿtest
.ÿgsemÿ(largeÿheavyÿ<-ÿc.(trunkÿturnÿheadroom),ÿlogit),ÿnodvheaderÿnolog

GeneralizedÿstructuralÿequationÿmodelÿÿÿÿÿÿÿÿÿÿÿNumberÿofÿobsÿÿÿÿÿ=ÿÿÿÿÿÿÿÿÿ74
Logÿlikelihoodÿ=ÿ-28.994612

------------------------------------------------------------------------------
ÿÿÿÿÿÿÿÿÿÿÿÿÿ|ÿÿÿÿÿÿCoef.ÿÿÿStd.ÿErr.ÿÿÿÿÿÿzÿÿÿÿP>|z|ÿÿÿÿÿ[95%ÿConf.ÿInterval]
-------------+----------------------------------------------------------------
largeÿÿÿÿÿÿÿÿ|
ÿÿÿÿÿÿÿtrunkÿ|ÿÿÿ.6016738ÿÿÿ.2420735ÿÿÿÿÿ2.49ÿÿÿ0.013ÿÿÿÿÿ.1272184ÿÿÿÿ1.076129
ÿÿÿÿÿÿÿÿturnÿ|ÿÿÿ.6710491ÿÿÿ.1982437ÿÿÿÿÿ3.38ÿÿÿ0.001ÿÿÿÿÿ.2824987ÿÿÿÿÿÿ1.0596
ÿÿÿÿheadroomÿ|ÿÿÿÿ.151776ÿÿÿ.8188332ÿÿÿÿÿ0.19ÿÿÿ0.853ÿÿÿÿ-1.453108ÿÿÿÿ1.756659
ÿÿÿÿÿÿÿ_consÿ|ÿÿ-34.96852ÿÿÿ9.041231ÿÿÿÿ-3.87ÿÿÿ0.000ÿÿÿÿ-52.68901ÿÿÿ-17.24803
-------------+----------------------------------------------------------------
heavyÿÿÿÿÿÿÿÿ|
ÿÿÿÿÿÿÿtrunkÿ|ÿÿÿ.2522394ÿÿÿ.1703835ÿÿÿÿÿ1.48ÿÿÿ0.139ÿÿÿÿ-.0817062ÿÿÿÿÿ.586185
ÿÿÿÿÿÿÿÿturnÿ|ÿÿÿ.7113509ÿÿÿ.1850704ÿÿÿÿÿ3.84ÿÿÿ0.000ÿÿÿÿÿ.3486196ÿÿÿÿ1.074082
ÿÿÿÿheadroomÿ|ÿÿÿÿ.525274ÿÿÿ.7456529ÿÿÿÿÿ0.70ÿÿÿ0.481ÿÿÿÿ-.9361789ÿÿÿÿ1.986727
ÿÿÿÿÿÿÿ_consÿ|ÿÿÿ-33.0206ÿÿÿ7.790509ÿÿÿÿ-4.24ÿÿÿ0.000ÿÿÿÿ-48.28972ÿÿÿ-17.75149
------------------------------------------------------------------------------

.ÿtestÿ[large]trunkÿ=ÿ[heavy]trunk,ÿnotest

ÿ(ÿ1)ÿÿ[large]trunkÿ-ÿ[heavy]trunkÿ=ÿ0

.ÿtestÿ[large]turnÿ=ÿ[heavy]turn,ÿnotestÿaccumulate

ÿ(ÿ1)ÿÿ[large]trunkÿ-ÿ[heavy]trunkÿ=ÿ0
ÿ(ÿ2)ÿÿ[large]turnÿ-ÿ[heavy]turnÿ=ÿ0

.ÿtestÿ[large]headroomÿ=ÿ[heavy]headroom,ÿaccumulate

ÿ(ÿ1)ÿÿ[large]trunkÿ-ÿ[heavy]trunkÿ=ÿ0
ÿ(ÿ2)ÿÿ[large]turnÿ-ÿ[heavy]turnÿ=ÿ0
ÿ(ÿ3)ÿÿ[large]headroomÿ-ÿ[heavy]headroomÿ=ÿ0

ÿÿÿÿÿÿÿÿÿÿÿchi2(ÿÿ3)ÿ=ÿÿÿÿ1.80
ÿÿÿÿÿÿÿÿÿProbÿ>ÿchi2ÿ=ÿÿÿÿ0.6142

.ÿ
.ÿ//ÿLikelihood-ratioÿtest
.ÿestimatesÿstoreÿUnconstrained

.ÿ
.ÿconstraintÿdefineÿ1ÿ[large]trunkÿ=ÿ[heavy]trunk

.ÿconstraintÿdefineÿ2ÿ[large]turnÿ=ÿ[heavy]turn

.ÿconstraintÿdefineÿ3ÿ[large]headroomÿ=ÿ[heavy]headroom

.ÿgsemÿ(largeÿheavyÿ<-ÿc.(trunkÿturnÿheadroom),ÿlogit),ÿ///
>ÿÿÿÿÿÿÿÿÿconstraints(1/3)ÿnocnsreportÿnodvheaderÿnolog

GeneralizedÿstructuralÿequationÿmodelÿÿÿÿÿÿÿÿÿÿÿNumberÿofÿobsÿÿÿÿÿ=ÿÿÿÿÿÿÿÿÿ74
Logÿlikelihoodÿ=ÿ-29.965072

------------------------------------------------------------------------------
ÿÿÿÿÿÿÿÿÿÿÿÿÿ|ÿÿÿÿÿÿCoef.ÿÿÿStd.ÿErr.ÿÿÿÿÿÿzÿÿÿÿP>|z|ÿÿÿÿÿ[95%ÿConf.ÿInterval]
-------------+----------------------------------------------------------------
largeÿÿÿÿÿÿÿÿ|
ÿÿÿÿÿÿÿtrunkÿ|ÿÿÿ.3993953ÿÿÿÿ.136128ÿÿÿÿÿ2.93ÿÿÿ0.003ÿÿÿÿÿ.1325894ÿÿÿÿ.6662012
ÿÿÿÿÿÿÿÿturnÿ|ÿÿÿ.6713905ÿÿÿÿ.128934ÿÿÿÿÿ5.21ÿÿÿ0.000ÿÿÿÿÿ.4186844ÿÿÿÿ.9240965
ÿÿÿÿheadroomÿ|ÿÿÿ.3462579ÿÿÿ.5404406ÿÿÿÿÿ0.64ÿÿÿ0.522ÿÿÿÿ-.7129862ÿÿÿÿ1.405502
ÿÿÿÿÿÿÿ_consÿ|ÿÿ-32.84459ÿÿÿ5.596455ÿÿÿÿ-5.87ÿÿÿ0.000ÿÿÿÿ-43.81344ÿÿÿ-21.87574
-------------+----------------------------------------------------------------
heavyÿÿÿÿÿÿÿÿ|
ÿÿÿÿÿÿÿtrunkÿ|ÿÿÿ.3993953ÿÿÿÿ.136128ÿÿÿÿÿ2.93ÿÿÿ0.003ÿÿÿÿÿ.1325894ÿÿÿÿ.6662012
ÿÿÿÿÿÿÿÿturnÿ|ÿÿÿ.6713905ÿÿÿÿ.128934ÿÿÿÿÿ5.21ÿÿÿ0.000ÿÿÿÿÿ.4186844ÿÿÿÿ.9240965
ÿÿÿÿheadroomÿ|ÿÿÿ.3462579ÿÿÿ.5404406ÿÿÿÿÿ0.64ÿÿÿ0.522ÿÿÿÿ-.7129862ÿÿÿÿ1.405502
ÿÿÿÿÿÿÿ_consÿ|ÿÿ-32.84459ÿÿÿ5.596455ÿÿÿÿ-5.87ÿÿÿ0.000ÿÿÿÿ-43.81344ÿÿÿ-21.87574
------------------------------------------------------------------------------

.ÿlrtestÿUnconstrained

Likelihood-ratioÿtestÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿLRÿchi2(3)ÿÿ=ÿÿÿÿÿÿ1.94
(Assumption:ÿ.ÿnestedÿinÿUnconstrained)ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿProbÿ>ÿchi2ÿ=ÿÿÿÿ0.5848

.ÿ
.ÿexit

endÿofÿdo-file

.
Comment
Jenny Williams

Join Date: Sep 2017

Posts: 35
#3

16 Apr 2018, 20:00

Professor Coveney, thank you very much for this wonderful answer. I also realized that I was doing a manual "suest" (without the clustering) by stacking the data.
Comment

Joseph Coveney

Join Date: Apr 2014
Posts: 4540

17 Apr 2018, 00:57

If you want to replicate the results that you get with suest by using gsem, then use the vce(robust) option. Run the do-file below and compare the results to prove to yourself that they are the same.

You can also fit a seemingly unrelated bivariate probit regression model using biprobit. (You can fit this model, too, using gsem.)

I'm not sure that I would go this route, but if you want to "stack" the data and test an interaction term, then you might want to look into something like that below with xt (or corresponding me) commands.

Code:

version 15.1

clear *

quietly sysuse auto

quietly centile length
generate byte large = length > r(c_1)

quietly centile weight
generate byte heavy = weight > r(c_1)

// -suest-
quietly logit large c.(trunk turn headroom)
estimates store L

quietly logit heavy c.(trunk turn headroom)
estimates store H

suest L H
test [L_large]trunk = [H_heavy]trunk, notest
test [L_large]turn = [H_heavy]turn, notest accumulate
test [L_large]headroom = [H_heavy]headroom, accumulate

// cf.
gsem (large heavy <- c.(trunk turn headroom), logit), vce(robust) nodvheader nolog
test [large]trunk = [heavy]trunk, notest
test [large]turn = [heavy]turn, notest accumulate
test [large]headroom = [heavy]headroom, accumulate

// fitting a bivariate-probit model
biprobit large heavy c.(trunk turn headroom), nolog
test [large]trunk = [heavy]trunk
test [large]turn = [heavy]turn, accumulate
test [large]headroom = [heavy]headroom, accumulate

// stacking data, with interaction term
encode make, generate(Make) label(Makes)
keep large heavy trunk turn headroom Make
rename large dep1
rename heavy dep2
quietly reshape long dep, i(Make) j(outcome)

xtlogit dep i.outcome##c.(trunk turn headroom), i(Make) re nolog
test 2.outcome#c.trunk 2.outcome#c.turn 2.outcome#c.headroom

estimates store Full
xtlogit dep i.outcome c.(trunk turn headroom), i(Make) re nolog
lrtest Full

// or
xtprobit dep i.outcome##c.(trunk turn headroom), i(Make) re nolog
test 2.outcome#c.trunk 2.outcome#c.turn 2.outcome#c.headroom

exit

Comment

Jenny Williams

Join Date: Sep 2017

Posts: 35
#5

17 Apr 2018, 01:28

Thank you again Joseph for another excellent response. This summarizes what I have been researching over the past few days and is a handy reference. So now my question is - if I am examining a bivariate (or multivariate) logistic regression model, and if I am interested in examining whether the impact of my exposure variables (trunk turn headroom) are different across the various outcome variables (large and heavy), will these models (in particular the SEM/suest simultaneous models as above) allow me to test for these differences? I think with the robust estimators, it should account for any correlation between the outcomes. But do I really need to look at incorporating other methodological considerations such as the Allison 1989 paper for unobserved heterogeneity or look at comparing outcomes on the probability scale?
Comment

Announcement

Comparing regression estimates across different outcome models and testing whether there is heterogeneity

Comment

Comment

Comment

Comment