  • k-means clustering using an existing similarity matrix

    I have a dataset of workers that I want to divide into clusters based on several observables, one of which is categorical (industry). I'm trying to use k-means clustering. Instead of using many dummy variables for the different categories, I built a similarity measure between each pair of industries and want to use it in the k-means algorithm. My question is: how can I use this pre-existing similarity matrix in the k-means computation, together with other continuous variables (e.g., education)? What I do now is first collapse the similarity matrix into 2 or 3 dimensions using multidimensional scaling and then use the resulting coordinates in k-means. Is there a way to use the similarity matrix directly?
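
    To make the current two-step approach concrete, here is a minimal sketch in Python with NumPy. It is an illustration only, not the code I actually run: the four industries, the similarity values, the education figures, the conversion of similarity to dissimilarity as 1 - similarity, and the way the two feature blocks are scaled are all assumptions made up for the example.

```python
import numpy as np

def classical_mds(dissimilarity, n_components=2):
    """Classical (Torgerson) MDS: coordinates whose Euclidean distances
    approximate the given pairwise dissimilarities."""
    d2 = np.asarray(dissimilarity, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering       # double-centered matrix
    eigvals, eigvecs = np.linalg.eigh(gram)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against negative eigenvalues
    return eigvecs[:, order] * np.sqrt(top_vals)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm, initialized at k distinct data points."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

# Hypothetical similarity between 4 industries (1 = identical).
similarity = np.array([
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.7],
    [0.1, 0.2, 0.7, 1.0],
])

# Step 1: collapse the matrix to 2 dimensions (assumes dissimilarity = 1 - similarity).
industry_coords = classical_mds(1.0 - similarity, n_components=2)

# Hypothetical workers: industry index and years of education.
industry = np.array([0, 1, 0, 1, 2, 3, 2, 3])
education = np.array([11.0, 12.0, 10.0, 13.0, 17.0, 18.0, 16.0, 19.0])

# Step 2: attach each worker's industry coordinates to the continuous variable.
mds_block = industry_coords[industry]
mds_block = mds_block / mds_block.std()  # one common factor keeps relative distances
educ_z = (education - education.mean()) / education.std()
features = np.column_stack([mds_block, educ_z])

labels, centers = kmeans(features, k=2)
print(labels)
```

    The relative scaling of the MDS block and the standardized education column acts as an implicit weight on industry versus education in the Euclidean distance, so that choice matters as much as the number of MDS dimensions kept.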