Generalized Estimating Equations

John Larsson

Join Date: May 2014

Posts: 33
#1

28 Aug 2014, 10:13

Hello,
I have a dataset with a binary outcome in which the observations are potentially correlated because some individuals appear more than once. I have been using SPSS's Generalized Estimating Equations option for this, which lets you experiment with different working correlation structures. Could someone kindly tell me the equivalent option in Stata? Also, is there an option to do post-estimation with the margins command?
Best regards,
John L.
Clyde Schechter

Join Date: Apr 2014

Posts: 30116
#2

28 Aug 2014, 10:33

The corresponding Stata command is -xtgee-, and the -corr()- option will let you specify several different kinds of working correlation structures. And, yes, the -margins- command will work its wonders afterwards. See the online help and manual section of -xtgee-.
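As a minimal sketch of what this looks like in practice: the example below uses Stata's shipped example dataset (loaded with -webuse union-) as a stand-in for the poster's data, so the variable names (union, age, grade, south, idcode, year) are assumptions for illustration only.

```stata
* Illustrative only: -webuse union- stands in for the poster's data
webuse union, clear

* Declare the person identifier and the time variable
xtset idcode year

* Population-averaged logit fitted by GEE, exchangeable working correlation,
* with robust (sandwich) standard errors
xtgee union age grade i.south, family(binomial) link(logit) corr(exchangeable) vce(robust)

* Other working correlation structures use the same option, for example
*   corr(independent)   corr(ar 1)   corr(unstructured)

* Post-estimation: average marginal effects on Pr(union = 1)
margins, dydx(age grade)

* Post-estimation: predicted probabilities at each level of south
margins south
```

After -xtgee-, -margins- works on the predicted mean by default, which for a binomial family is the predicted probability.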
John Larsson

Join Date: May 2014

Posts: 33
#3

28 Aug 2014, 13:36

Thanks very much for the response. I have my model up and running. I have a follow-up, if I may. In SPSS, I previously got a statistic called "Quasi Likelihood under Independence Model Criterion (QIC)", which is used to help select between models. Is there anything similar in Stata? Specifically, I am interested in comparing a GEE model with an independent working correlation matrix to one that has an exchangeable structure.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2169
#4

28 Aug 2014, 16:18

I'm not sure why you would want to choose between them. Both estimators are consistent if the model for the binary response probability is correctly specified. If you get notably different estimates, then either the probability model is misspecified or the explanatory variables are not "strictly exogenous." Using any correlation structure other than independent requires that the covariates in any period be uncorrelated with the underlying errors in every period. A large difference between the independent and exchangeable estimates can therefore indicate a failure of strict exogeneity. A goodness-of-fit test can't help resolve this issue because it doesn't care about endogenous explanatory variables.

If the estimates are similar enough, report both along with the robust standard errors. Or, choose one set to report as the main results and use the others as a sensitivity analysis.
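One way to put the two sets of estimates side by side is sketched below. It again assumes the -webuse union- example data as a stand-in; the stored-estimate names (indep, exch) are arbitrary labels chosen for illustration.

```stata
* Illustrative only: example data and variable names are stand-ins
webuse union, clear
xtset idcode year

* Independent working correlation (the pooled estimator), robust SEs
xtgee union age grade i.south, family(binomial) link(logit) corr(independent) vce(robust)
estimates store indep

* Exchangeable working correlation, robust SEs
xtgee union age grade i.south, family(binomial) link(logit) corr(exchangeable) vce(robust)
estimates store exch

* Coefficients and robust standard errors side by side
estimates table indep exch, b(%9.4f) se(%9.4f)
```

Large gaps between the two coefficient columns would be the warning sign described above; similar coefficients with smaller standard errors in one column would suggest an efficiency gain from that structure.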

JW
John Larsson

Join Date: May 2014

Posts: 33
#5

29 Aug 2014, 09:52

Hi Jeff,
Thanks for the help. In line with your comments, Agresti (2013, 463) writes that "Although the model parameter estimates are usually fine whatever working correlation assumption we choose, their model-based standard errors are not. More appropriate standard errors result from an adjustment the GEE method can make using the empirical dependence the data exhibit. The standard errors based on the working correlation assumption are updated using the empirical dependence to yield more appropriate (robust) standard errors."
So if I request the vce(robust) option for the xtgee command, do the standard errors include this empirical adjustment?
JL
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2169
#6

29 Aug 2014, 16:11

Yes, they are. In fact, subject to the usual problem that we only see a single sample of data, it makes sense to compare the standard errors to see if having a non-scalar working correlation matrix pays dividends. I've done a handful of examples with binary and fractional responses where using either an exchangeable or unstructured working correlation matrix seems to give no efficiency gain over the pooled method (with identity WCM). With linear models it seems to help more often.
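To see what the robust adjustment changes, the same model can be fitted with and without -vce(robust)-. This is a sketch on the -webuse union- example data (a stand-in, not the poster's data); the stored-estimate names are arbitrary.

```stata
* Illustrative only: example data and variable names are stand-ins
webuse union, clear
xtset idcode year

* Default standard errors, which rely on the working correlation being correct
xtgee union age grade i.south, family(binomial) link(logit) corr(exchangeable)
estimates store model_se

* Robust (sandwich) standard errors, which use the empirical dependence
xtgee union age grade i.south, family(binomial) link(logit) corr(exchangeable) vce(robust)
estimates store robust_se

* Point estimates are identical; only the standard errors differ
estimates table model_se robust_se, b(%9.4f) se(%9.4f)
```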
Joseph Coveney

Join Date: Apr 2014

Posts: 4420
#7

29 Aug 2014, 18:54

Originally posted by John Larsson

In SPSS, I previously got a statistic called "Quasi Likelihood under Independence Model Criterion (QIC)", which is used to help select between models. Is there anything similar in Stata?

Typing -search qic- at the Stata command line turns up a user-written command for QIC that you can install. You can give that a try, if you're still interested in light of what Jeff Wooldridge said.
John Larsson

Join Date: May 2014

Posts: 33
#8

30 Aug 2014, 19:40

Hi Joseph,
Thanks, I installed the qic utility. I don't know much about this particular statistic, but according to this criterion an "independent" correlation structure yields the smallest QIC. This is interesting because an "exchangeable" structure seems to make the most theoretical sense in my particular case. And even though I get the lower QIC when using an independent structure, the estimated correlation when using an "exchangeable" structure is around 0.3, which strikes me as something that ought to be taken into account.
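For reference, the estimated working correlation can be displayed after fitting the model with -estat wcorrelation-. The sketch below uses the -webuse union- example data as a stand-in, so the output will not match the 0.3 reported here.

```stata
* Illustrative only: example data and variable names are stand-ins
webuse union, clear
xtset idcode year

xtgee union age grade i.south, family(binomial) link(logit) corr(exchangeable) vce(robust)

* Full estimated within-person working correlation matrix
estat wcorrelation

* Only the estimated correlation parameter(s), not the whole matrix
estat wcorrelation, compact
```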
Cheers,
JL
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2169
#9

31 Aug 2014, 05:07

John: Your experience is why I'm suspicious of model selection statistics in the GEE context. There would be a proper way to use a goodness-of-fit statistic that depends on the first two moments, and this would be to evaluate the Gaussian log-likelihood with the first two moments plugged in. (The Gaussian distribution is a member of the quadratic exponential family, and therefore identifies the first two moments without any distributional assumption. So it will account for both the mean and variance estimates.) If this is what qic does, then fine. But I suspect it is not doing this. If it simply takes the log-likelihood under independence and plugs in the two different sets of estimates, the estimates obtained under independence will always win: that is an algebraic fact that follows from maximizing a function. So I would be very wary until you know exactly what the statistic is. I will try to get a moment to look at it.
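The algebraic fact can be written out in a short sketch. It assumes, without confirmation from this thread, that the user-written command follows the definition of QIC in Pan (2001). Here $Q(\beta; I)$ is the quasi-likelihood under independence (for a logit model, the pooled Bernoulli log-likelihood), $\hat\beta_R$ is the GEE estimate under working correlation $R$, $\hat\Omega_I$ is the negative Hessian of $Q(\beta; I)$ evaluated at $\hat\beta_R$, and $\hat V_R$ is the robust covariance estimate.

```latex
\begin{align*}
\hat\beta_I &= \arg\max_{\beta}\, Q(\beta; I)
  \quad\Longrightarrow\quad
  Q(\hat\beta_I; I) \;\ge\; Q(\hat\beta_R; I) \quad \text{for any } R, \\[4pt]
\mathrm{QIC}(R) &= -2\, Q(\hat\beta_R; I)
  \;+\; 2\, \operatorname{tr}\!\bigl(\hat\Omega_I \hat V_R\bigr).
\end{align*}
```

Under that assumed definition, the first term can never favor a non-independent structure, so only the trace penalty can move the ranking away from independence.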