I’m trying to do optimal matching and cluster analysis for sequences of a variable with 3 states and 4 time points, formatted as follows:
Code:
IDvar Seqvar Time
1     2      1
1     2      2
1     0      3
1     0      4
2     1      1
2     0      2
2     0      3
2     0      4
Using the sq package, I was able to identify 81 unique sequences. I then performed optimal matching on the full data set, and hierarchical cluster analysis using Ward's method with the following commands:
Code:
sqset Seqvar IDvar time
sqclusterdat
clustermat wardslinkage SQdist, name(wards) add
cluster generate clustervar = groups(2/10), ties(more) name(wards)
At this point, I am able to see how my cluster variables are distributed using the command tab1 clustervar*. However, when I attempt to link the clustered data with sequence data, by running
Code:
sqclusterdat, return keep(clustervar*)
I am met with the following message:
Code:
Group results could not be merged to sequence data. Returned to original data
What am I missing?
Thanks!