k-mean clustering using existing similarity matrix

muly san

Join Date: Feb 2017

Posts: 15
#1

k-mean clustering using existing similarity matrix

11 Jan 2019, 04:03

I have a dataset of workers, and I want to divide them into clusters based on some observables, one of them is categorical (industry). I'm trying to do k-mean clustering. Instead of using many dummy variables for the different categories, I built a similarity measure between each pair of industries and want to use it in the k-mean algorithm. My question is how can I use this pre-existing similarity matrix in the k-mean computation, together with other continuous variables (e.g., education). The way I'm doing it right now is first to collapse the similarity matrix into 2 or 3 dimensions using multidimensional scaling process and then use the results in the k-mean method. Is there a way to use the similarity matrix directly?
Tags: None

Announcement