Clustering analysis of longitudinal data

Christopher Kaufmann

Join Date: Dec 2014

Posts: 9
#1

08 Feb 2016, 15:39

I am conducting an analysis using data on 40 participants who reported how many hours they slept (on a 5-point scale) every night over 11 weeks. I would like to conduct two clustering analyses to:

1) identify types of sleepers. For example, are there groups of individuals who are variable sleepers vs. more consistent sleepers?
2) identify patterns of sleeping periods within individuals. For example, for each participant over the 11 weeks, are there periods in which they sleep more hours or have more variable sleep than in other periods?

I am aware there are data-mining techniques such as K-means, but I understand these methods are for static data and are not appropriate for clustering individuals based on their repeated measures. What are possible approaches I could take for these types of analyses?

Thanks everyone!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#2

08 Feb 2016, 22:46

Christopher:
As far as your question #1 is concerned, I would think about using -egen- with the -group()- function, specifying some threshold values in terms of sleeping hours.

Kind regards,
Carlo
(Stata 19.0)
wbuchanan

Join Date: Mar 2014

Posts: 1362
#3

09 Feb 2016, 02:27

Is there only a single measure related to sleep, or were there multiple items? You could decompose the variance on a single item using a mixed effects model. If you have multiple items at each time period and are trying to make inferences about a latent quality of sleep, you would need to test measurement invariance across time periods (e.g., use the administration time as the grouping variable). As long as the latent variable satisfies your needs regarding measurement invariance, you could then use a latent growth curve model to similarly decompose the variance into within- and between-subject effects.
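
To make the idea of decomposing the variance concrete, here is a minimal sketch in Python rather than Stata. It is not a fitted mixed model: it assumes balanced data (the same number of nights for every participant) and uses the classical one-way ANOVA (method-of-moments) estimates of the between- and within-person variance. The function name `variance_components` and the toy data are made up for illustration. In Stata, the closer equivalent would be a random-intercept model fitted with -mixed-.

```python
import statistics

def variance_components(data):
    """Balanced one-way ANOVA estimates of between- and within-person variance.

    data: list of per-person lists, all of the same length T (T >= 2).
    Returns (between-person variance, within-person variance, ICC).
    """
    n, T = len(data), len(data[0])
    person_means = [statistics.mean(row) for row in data]
    grand_mean = statistics.mean(person_means)

    # Mean square between persons and mean square within persons.
    ms_between = T * sum((m - grand_mean) ** 2 for m in person_means) / (n - 1)
    ms_within = sum(
        (y - m) ** 2 for row, m in zip(data, person_means) for y in row
    ) / (n * (T - 1))

    # Method-of-moments estimate; truncated at zero if it comes out negative.
    var_between = max(0.0, (ms_between - ms_within) / T)
    icc = var_between / (var_between + ms_within)
    return var_between, ms_within, icc

# Toy data (hypothetical): two people, two nights each.
# Here the between-person variance is 7.0, the within-person variance is 2.0,
# and the intraclass correlation is 7/9.
toy = [[1, 3], [5, 7]]
print(variance_components(toy))
```

A high intraclass correlation would suggest that most of the variation lies between people (stable differences in typical sleep), while a low one would point to night-to-night variation within people.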
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#4

09 Feb 2016, 16:46

I suspect you have to add more structure to the problem. You can develop individual-level variables like variability in sleep hours or mean sleep hours. With that, you could analyse the data at the individual level. Alternatively, following @wbuchanan's suggestion, if these patterns could be observed at less than the individual level, a variety of mixed effects models might be helpful.

I think the problem is that you don't have a conventional dependent variable (e.g., hours of sleep) for which a single latent quality might work. With the observed data being a series of nightly sleep hours, I don't see how a single latent quality would explain whether some individuals have greater variation in sleep hours than others, or whether individuals have periods of more or less sleep. Likewise, I don't think any of the clustering tools I know of would group individuals this way unless you created such variables (e.g., individual variation in sleep hours) up front. I suspect you have to create the universe of potential patterns (most likely using egen to create individual-level variables) before the analysis can see whether individuals group naturally on those patterns.
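
As a rough illustration of that two-step idea (build individual-level variables first, then cluster on them), here is a minimal sketch in Python rather than Stata. Everything in it is an assumption made for the example: the data are simulated, not the original poster's; the two summary variables (each person's mean and standard deviation of nightly sleep) are just one possible choice; and plain k-means with two clusters stands in for whatever clustering tool one prefers. The helper names `person_features`, `sq_dist`, and `kmeans` are hypothetical.

```python
import random
import statistics

def person_features(nights):
    """Summarise one participant as (mean, standard deviation) of nightly scores."""
    return (statistics.mean(nights), statistics.stdev(nights))

def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from all centres chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest centre.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(statistics.mean(dim) for dim in zip(*members))
    return labels

# Simulated data (NOT the poster's): 40 people x 77 nights on a 1-5 scale.
rng = random.Random(1)
# 20 "consistent" sleepers: almost always 3, occasionally 2 or 4.
consistent = [[rng.choice([3] * 18 + [2, 4]) for _ in range(77)] for _ in range(20)]
# 20 "variable" sleepers: any value from 1 to 5 with equal probability.
variable = [[rng.randint(1, 5) for _ in range(77)] for _ in range(20)]

features = [person_features(nights) for nights in consistent + variable]
labels = kmeans(features, k=2)
print(labels)
```

With groups this well separated, the cluster labels should line up with the two simulated sleeper types. Real data will rarely be so clean, and the features should probably be standardised if they are on different scales. The same approach could be applied to week-level summaries within a person to look for distinct periods.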
Christopher Kaufmann

Join Date: Dec 2014

Posts: 9
#5

12 Feb 2016, 13:59

Thank you all for your help! This information is very helpful.