Questions on clustering methods

Question

Recently I started studying clustering in data mining, and so far I've covered sequential clustering, hierarchical clustering, and k-means.

I also read a statement that distinguishes k-means from the other two clustering techniques: it said k-means is not very good at dealing with nominal attributes, but the text didn't explain why. So far, the only difference I can see is that with k-means we know in advance that we need exactly K clusters, while with the other two methods we don't know how many clusters we need.

So could anybody give me some idea of why this statement is made, i.e., why k-means has a problem with examples that have nominal attributes, and whether there is a way to overcome it?

Thanks in advance.

Stompchicken · Accepted Answer

The k-means algorithm calculates each cluster centroid by taking the mean of all the points in the cluster. If an attribute is nominal, you can't take a mean value.
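To make the centroid step concrete, here is a minimal sketch in plain Python. The `centroid` helper and the sample data are made up for illustration and are not taken from any particular k-means library. It shows the mean working for numeric attributes and failing for a nominal one.

```python
def centroid(points):
    """Return the component-wise mean of a list of equal-length tuples."""
    n = len(points)
    # zip(*points) groups the values of each attribute together.
    return tuple(sum(coords) / n for coords in zip(*points))


# Numeric attributes: the mean is well defined.
numeric_cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 9.0)]
numeric_centroid = centroid(numeric_cluster)

# Nominal attribute: there is nothing meaningful to add up or divide.
nominal_cluster = [("Windows",), ("Linux",), ("OSX",)]
try:
    centroid(nominal_cluster)
    nominal_ok = True
except TypeError:
    # sum() cannot add strings to its numeric start value.
    nominal_ok = False

print(numeric_centroid, nominal_ok)
```

The error here is only Python's way of surfacing the underlying issue: a "mean" of category labels has no definition.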

Sometimes nominal values can be put into a kind of order and then mapped to real values. For example, days of the week could be mapped onto the range [1.0, 7.0]. But sometimes that isn't possible, for example for an attribute with values [Windows, Linux, OSX], which have no natural order.
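As a rough illustration of that point, the sketch below maps days of the week to numbers and then tries the same trick on the operating-system attribute. The day abbreviations, the two codings, and the `coded_mean` helper are all arbitrary choices made for this example.

```python
# Days have a natural order, so a numeric mapping is at least defensible.
DAY_TO_NUMBER = {"Mon": 1.0, "Tue": 2.0, "Wed": 3.0, "Thu": 4.0,
                 "Fri": 5.0, "Sat": 6.0, "Sun": 7.0}

days = ["Mon", "Tue", "Wed"]
mean_day = sum(DAY_TO_NUMBER[d] for d in days) / len(days)


def coded_mean(values, coding):
    """Mean of nominal values after mapping them to numbers via `coding`."""
    return sum(coding[v] for v in values) / len(values)


# Operating systems have no natural order; any coding is arbitrary.
cluster = ["Windows", "OSX"]
coding_a = {"Windows": 0.0, "Linux": 1.0, "OSX": 2.0}
coding_b = {"Linux": 0.0, "Windows": 1.0, "OSX": 2.0}

# Under coding_a the mean lands exactly on Linux's code, even though
# the cluster contains no Linux machine at all.
mean_a = coded_mean(cluster, coding_a)
# Under coding_b the same cluster gets a different, equally arbitrary mean.
mean_b = coded_mean(cluster, coding_b)

print(mean_day, mean_a, mean_b)
```

The day mean falls on Tuesday, which is a sensible "middle" of Monday to Wednesday. The operating-system means depend entirely on which arbitrary coding was picked, so a centroid computed this way says nothing about the cluster.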

Tags:

artificial-intelligence

machine-learning

neural-network

data-mining

Kevin
