Intraclass correlation in longitudinal data

Dayane Rocha

Join Date: Aug 2016

Posts: 7
#1

17 Aug 2016, 07:00

Hello Stata users,

I'm trying to understand an example in the Stata manual on hierarchical models for longitudinal data. The example is in the documentation for the mixed command (page 294) and uses the pig dataset.
I ran a null model to compute the ICC (which is not covered in the mixed documentation), and the ICC is very small, essentially zero.
So I'm confused about how to interpret the ICC in longitudinal data. Can anyone help me?
Thank you!

My script for this and the results:

. use http://www.stata-press.com/data/r13/pig, replace
(Longitudinal analysis of pig weights)

. twoway connected weight week if id<=10, connect(L)

.
. *Null model for calculate icc:
. mixed weight || id:

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0: log likelihood = -1828.2965
Iteration 1: log likelihood = -1827.213
Iteration 2: log likelihood = -1827.2118
Iteration 3: log likelihood = -1827.2118 (backed up)

Computing standard errors:

Mixed-effects ML regression                     Number of obs      =       432
Group variable: id                              Number of groups   =        48

                                                Obs per group: min =         9
                                                               avg =       9.0
                                                               max =         9

                                                Wald chi2(0)       =         .
Log likelihood = -1827.2118                     Prob > chi2        =         .

------------------------------------------------------------------------------
      weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   50.40509   .7997195    63.03   0.000     48.83767    51.97251
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity                 |
                  var(_cons) |   4.52e-13   3.25e-12      3.45e-19    5.92e-07
-----------------------------+------------------------------------------------
               var(Residual) |   276.2861   18.79895       241.792    315.7012
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) =  1.8e-12 Prob >= chibar2 = 1.0000

. estat icc

Intraclass correlation

------------------------------------------------------------------------------
                       Level |        ICC   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
                          id |   1.63e-15          0      1.63e-15    1.63e-15
------------------------------------------------------------------------------
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#2

17 Aug 2016, 08:57

What you've done is absolutely correct. There is no problem here.

The intraclass correlation is the intraclass correlation of the residuals in the model, not of the dependent variable. So, in the null model, each pig's weight varies enormously over time. If you run

Code:

xtset id week
xtline weight, overlay legend(off)

you will see that the intra-pig weight variation is enormous and the between-pig weight variation is, relative to that, quite small. So the intraclass correlation is quite close to zero.
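
For a two-level random-intercept model, the ICC is the share of the total variance attributed to the random intercept. As a worked check against the null-model output in #1 (here \sigma_u^2 and \sigma_e^2 are notation introduced for var(_cons) and var(Residual), not symbols from the output):

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
     = \frac{4.52 \times 10^{-13}}{4.52 \times 10^{-13} + 276.2861}
     \approx 1.6 \times 10^{-15}
```

This agrees, up to rounding of the displayed estimates, with the value that -estat icc- reported.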

By contrast, when you run -mixed weight week || id:-, the trend of weight gain over the weeks is removed from the residuals. In the de-trended data, were you to calculate the fitted values and residuals and similarly plot them, you would see that most pigs' weights hew fairly close to their predicted values over time, but that there is substantial variation in predicted values between pigs. So this time most of the variation is between pigs, with little within each pig, and the ICC is high.

Added: Correction of terminology. When I referred to residuals above, what I really mean is weight - xb. I just remembered that if you -predict, resid- following -mixed-, you get weight - xb - u. So I am not referring to the residuals that -predict- calculates after -mixed- here. Try:

Code:

predict xb, xb
gen total_error = weight - xb
xtline total_error, overlay legend(off)

and you will see that each pig's trajectory is close to a horizontal line, but the different trajectories are rather spread out from each other. So the bulk of the variation here is between pigs, with little variation within pigs, hence a high ICC.
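
A minimal sketch of the comparison described above, assuming the same pig dataset URL used in #1. It fits the null model and the model with week, and requests the ICC after each, using only commands already named in this thread:

```stata
* Load the pig data (same URL as in #1); -clear- drops any data in memory
use http://www.stata-press.com/data/r13/pig, clear

* Null model: weight gain over the weeks is left in the residual variance
mixed weight || id:
estat icc

* Model with a linear time trend, as in the manual's example
mixed weight week || id:
estat icc
```

Comparing the two -estat icc- tables side by side shows how much the reported ICC depends on what the fixed part of the model has already explained.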

Last edited by Clyde Schechter; 17 Aug 2016, 09:01.
1 like
Dayane Rocha

Join Date: Aug 2016

Posts: 7
#3

17 Aug 2016, 16:29

Hello Mr. Schechter!

Thank you so much for your help! It's clearer to me now.
But I still have a question. Please correct me: I've learned that ICC (on null model) is a way to verify the need to use Hierarchical Models on my dataset, at least in cross sections. And that adding other variables to the model should reduce the ICC.
This example yields a very low ICC (near zero); however, fitting a hierarchical model is the objective of the example in the manual (as you mentioned: "mixed weight week || id:").
How can I justify the need for hierarchical models if the ICC is zero? Is there something different in the interpretation when I have longitudinal data?

Thank you again!
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#4

17 Aug 2016, 17:12

"I've learned that ICC (on null model) is a way to verify the need to use Hierarchical Models on my dataset, at least in cross sections."

I think that what you learned is incorrect.

The closest to that statement that I would endorse is that if you run a two-level model but the ICC from that model comes out close to zero, it may be reasonable to simplify the model to a single level.

It is certainly wrong to claim that the ICC derived from the null model has any bearing at all (in either direction) on the need for hierarchical structure in some different model. Just as the pigs data gives an example where the null model ICC is effectively zero, but a model including a time trend clearly needs the two-level structure, you could have a situation where it worked the other way. The null model ICC could be very high, due to the effects of some "pig"-level variable, but a model which included that variable might do very well without a second level if that variable accounted for nearly all of the "pig"-level variation.

I don't understand the distinction you are drawing between cross-sectional and longitudinal data in this context. If your data are fully cross-sectional, that is, you only have one observation per "pig" then there is no second level in the data and there is no ICC to calculate. If your data consist of multiple observations per "pig," that are replicates on some dimension other than time, then you have two levels and an ICC can be calculated. The fact that the dimension at the bottom level is replication but not time makes no difference and the logic of 2-level vs 1-level modeling remains the same.

Justifying the use of a hierarchical model for some problem is done by fitting that hierarchical model and observing that the higher level variance estimate is appreciably different from zero (equivalently, the ICC for that model is appreciably different from zero.) The ICC from the null model, or any different model, is irrelevant. You don't even have to go out of your way to calculate anything extra, not even the ICC, for this. If you look at the very last line of Stata's output from -mixed- it shows you a likelihood ratio test comparing that model to a linear model (by which it means a model without hierarchy), and you can simply see how that test turns out. A high chi square and low p-value suggest that the hierarchical model is preferable to a flat model.
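
A hedged sketch of reading that test from the stored results rather than from the last line of the output. It assumes -mixed- leaves the comparison test in e(chi2_c) and e(p_c), which is how its stored results are documented; -ereturn list- is included so the names can be confirmed on your version:

```stata
* Load the pig data (same URL as in #1)
use http://www.stata-press.com/data/r13/pig, clear

* Two-level model with a linear time trend
mixed weight week || id:

* Comparison test against the model without random effects
* (the same test printed on the last line of the -mixed- output)
display "chibar2(01) = " e(chi2_c) "   p-value = " e(p_c)

* List everything -mixed- stored, to confirm the names used above
ereturn list
```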

Added: Actually, I would stand your question on its head. If the data contains multiple levels, you do not need to justify using a hierarchical model. You need to justify not using one. When there are replicate observations per pig, whether over time, or anything else, those observations are, in most situations, not independent, so that inferences based on a one-level model (which assumes all observations are independent) will be incorrect. A one-level model is only admissible if you have one of those uncommon situations where the errors in the replicate observations are actually independent. That independence is measured by the ICC of the hierarchical model's error terms (and tested by the likelihood ratio test I mentioned in the preceding paragraph). Unless those results tell you that the replicates really are independent, it is just flat out wrong to use a one-level model. And that kind of independence among replicate observations is very much the exception, not the rule, in the real world.

Last edited by Clyde Schechter; 17 Aug 2016, 17:18.
3 likes
Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#5

18 Aug 2016, 02:09

I strongly support Clyde's clear and informative discussion. One possible small area of difference between us concerns his remark that:

"If the data contains multiple levels, you do not need to justify using a hierarchical model. You need to justify not using one."

Addressing the hierarchical nature of the data structure does not necessarily imply having to use a multi-level/mixed/random effects model. Economists in particular are fond of using fixed effects models in such contexts because, unlike MLM/mixed/RE models, one doesn't have to assume that the random effects and "fixed" predictors are uncorrelated.
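
A minimal sketch of the two approaches side by side, assuming the pig dataset from #1. Because week takes the same values for every pig in these data, the two slope estimates should essentially coincide, so this illustrates the syntax rather than a case where the estimators disagree:

```stata
* Load the pig data (same URL as in #1) and declare the panel structure
use http://www.stata-press.com/data/r13/pig, clear
xtset id week

* Fixed-effects (within) estimator: pig effects may be correlated with predictors
xtreg weight week, fe

* Random-intercept model fit by maximum likelihood,
* comparable to -mixed weight week || id:-
xtreg weight week, mle
```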
1 like
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#6

18 Aug 2016, 09:27

I agree with Stephen Jenkins. I was focused on the choice between a "flat" model and a random effects model, but other options such as fixed effects models may exist as well. What I think can be said in full generality, is that if the data has a multi-level structure, a proper analysis must account for it in some way unless you can demonstrate that the structure's effects in that particular data are ignorable.
1 like
Dayane Rocha

Join Date: Aug 2016

Posts: 7
#7

18 Aug 2016, 16:42

Hello Mr. Schechter and Mr. Jenkins,

I'm learning a lot from the discussion, thank you.

I'm starting to learn about these hierarchical models, and one book that I used (Raudenbush and Bryk) gives an example of math achievement of students (id) in different schools (group). This is the context of "cross-section" data that I'd mentioned before. Well, in this case, to demonstrate that schools really have different performances (and to justify the need for hierarchical models), they do a "one-way ANOVA" and calculate the ICC. Are there problems with my logic here too?

Then I tried to learn how hierarchical models work with longitudinal data. That was when I found the example in the Stata manual.

Thanks!!
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#8

18 Aug 2016, 17:04

Originally posted by Dayane Rocha:

". . . (Raudenbush and Bryk) gives an example of math achievement of students (id) in different schools (group). This is the context of "cross-section" data that I'd mentioned before. Well, in this case, to demonstrate that schools really have different performances (and to justify the need for hierarchical models), they do a "one-way ANOVA" and calculate the ICC. Are there problems with my logic here too?"

Raudenbush and Bryk considered students to be exchangeable within schools in their cross-sectional sample. Because the pig weights are longitudinal, the weights are not exchangeable within pigs. There is relevant information in the sequence (time since birth) of the weight. It is important to include this relevant information in the model in order to obtain a better estimate of the intraclass correlation coefficient.
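
A minimal sketch of that contrast on the pig data, assuming the dataset from #1. The variable name detrended is hypothetical, and removing a single common linear trend with -regress- is a simplification chosen for illustration, not the method used in the manual:

```stata
* Load the pig data (same URL as in #1)
use http://www.stata-press.com/data/r13/pig, clear

* One-way ANOVA ICC, treating the nine weights within each pig as exchangeable
loneway weight id

* Remove the common linear time trend, then repeat on what is left
regress weight week
predict double detrended, residuals
loneway detrended id
```

The first -loneway- ignores the ordering of the weights within a pig; the second uses the sequence information before asking how much of the remaining variation lies between pigs.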
1 like
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#9

18 Aug 2016, 17:12

I read Raudenbush and Bryk several years back, too. I vaguely remember the example you mention, but not well enough to comment on it here. And I don't have the book any more.

I think multi-level modeling has a very steep learning curve. It took me a very long time to get comfortable with it, and I still don't feel I've really mastered the subject. Perhaps that is because it is one of the things I've learned in recent years, as I'm getting older. But I really do think it's more difficult material than most of statistics. Anyway, if you would like another resource for learning about these, I found the online course offered by Bristol University (at no charge!) extremely helpful. You can tailor your path through it based on your current level of knowledge, and you get good feedback on your progress after each session. Also, the examples and exercises are available in Stata (though they do not use the most recent version of Stata). And I got a lot more out of my second reading of Raudenbush and Bryk after doing the Bristol course.
2 likes
Joseph Coveney

Join Date: Apr 2014

Posts: 4410
#10

18 Aug 2016, 20:15

In light of Clyde's post, I would rather that I had said, "One-way ANOVA considers students to be exchangeable within schools in the cross-sectional sample."
1 like
Dayane Rocha

Join Date: Aug 2016

Posts: 7
#11

19 Aug 2016, 20:54

Mr. Coveney, I had not thought about exchangeability, thank you!
I enjoyed the discussion and I'm very grateful for the attention and for the kind and thoughtful answers that you all gave me.
Mr. Schechter, I will take this course for sure, thanks for the tip!!