
Identify clusters in SOM (Self Organizing Map)

Once I have collected and organized data in a SOM, how do I identify clusters?

(Items are aggregated and clustered using many traits, upwards of 10.)

Specifically, I want to find the 'center' of each cluster, which would therefore give me the 'center' node(s).

Wallter asked Oct 25 '12 18:10




2 Answers

This is an old question, but I ran into the same issue and had some success implementing the approach from "Estimating the Number of Clusters in Multivariate Data by Self-Organizing Maps", so I thought I'd share.

The linked algorithm uses the U-matrix to highlight the boundaries between the individual clusters and then applies an image-processing algorithm called watershedding to identify the components. For this to work correctly, the regions in the U-matrix are required to be concave within the resolution of your quantization (which, once the U-matrix is converted to a binary image, simply amounts to using a flood fill to identify the regions).
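To make the idea concrete, here is a minimal sketch of the simplified binary-image variant mentioned above: compute a U-matrix, threshold it, and flood-fill the low-distance regions. It is not the watershed procedure from the paper. The function names (`u_matrix`, `label_regions`), the 4-connected rectangular grid, and the fixed `threshold` are all assumptions made for illustration; the SOM weights are assumed to be available as a nested list `weights[row][col]` of vectors.

```python
from collections import deque
from math import dist


def neighbours(r, c, rows, cols):
    """Yield the 4-connected grid neighbours of (r, c) that lie inside the map."""
    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def u_matrix(weights):
    """Mean distance from each node's weight vector to its grid neighbours."""
    rows, cols = len(weights), len(weights[0])
    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ds = [dist(weights[r][c], weights[nr][nc])
                  for nr, nc in neighbours(r, c, rows, cols)]
            u[r][c] = sum(ds) / len(ds)
    return u


def label_regions(u, threshold):
    """Flood-fill connected nodes whose U-value is below `threshold`.

    Nodes at or above the threshold are treated as boundaries and keep label 0.
    Returns the label grid and the number of regions found.
    """
    rows, cols = len(u), len(u[0])
    labels = [[0] * cols for _ in range(rows)]
    n_regions = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] or u[r][c] >= threshold:
                continue
            n_regions += 1
            labels[r][c] = n_regions
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in neighbours(cr, cc, rows, cols):
                    if not labels[nr][nc] and u[nr][nc] < threshold:
                        labels[nr][nc] = n_regions
                        queue.append((nr, nc))
    return labels, n_regions


# Toy 4x6 map: left half near (0, 0), right half near (10, 10).
grid = [[[0.0, 0.0] if c < 3 else [10.0, 10.0] for c in range(6)]
        for r in range(4)]
labels, n_regions = label_regions(u_matrix(grid), threshold=1.0)
for row in labels:
    print(row)
```

The threshold has to be chosen per data set (for example from a histogram of the U-matrix values); a proper watershed avoids that choice, which is why the paper uses it.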

Lanting answered Oct 22 '22 01:10


As far as I can tell, SOM is primarily a data-driven dimensionality reduction and data compression method. So it won't cluster the data for you; it may actually tend to spread clusters in the projection (i.e. split them into multiple cells).

However, for some data sets it may work well to do one of the following:

  • Instead of processing the full data set, work only on the SOM nodes (weighted by the number of elements assigned to them), which should be significantly smaller
  • Instead of working in the original space, work in the lower-dimensional space that the SOM represents

Then run a regular clustering algorithm on the transformed data.
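As a rough sketch of the first option, the snippet below runs a weighted k-means over the SOM nodes only, using each node's hit count as its weight, and then reports the node closest to each centroid as that cluster's 'center' node. K-means is just one possible "regular clustering algorithm" here; the function names, the farthest-point initialisation, and the toy data are assumptions for illustration, and `k` must be chosen by you.

```python
from math import dist


def weighted_kmeans(points, counts, k, iterations=50):
    """K-means over SOM node vectors, each weighted by its hit count."""
    dim = len(points[0])
    # Deterministic farthest-point initialisation, starting at the heaviest node.
    heaviest = max(range(len(points)), key=lambda i: counts[i])
    centroids = [list(points[heaviest])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(list(far))

    assign = []
    for _ in range(iterations):
        new_assign = [min(range(k), key=lambda j: dist(p, centroids[j]))
                      for p in points]
        if new_assign == assign:
            break  # converged
        assign = new_assign
        for j in range(k):
            members = [i for i, a in enumerate(assign) if a == j]
            total = sum(counts[i] for i in members)
            if total == 0:
                continue  # empty or zero-weight cluster: keep the old centroid
            centroids[j] = [
                sum(counts[i] * points[i][d] for i in members) / total
                for d in range(dim)
            ]
    return centroids, assign


def center_nodes(points, centroids, assign):
    """Index of the SOM node closest to each cluster centroid (None if empty)."""
    centers = []
    for j, c in enumerate(centroids):
        members = [i for i, a in enumerate(assign) if a == j]
        centers.append(min(members, key=lambda i: dist(points[i], c))
                       if members else None)
    return centers


# Toy data: six SOM node weight vectors and how many items mapped to each.
node_vectors = [[0, 0], [1, 0], [0, 1], [10, 10], [11, 10], [10, 11]]
hit_counts = [5, 1, 1, 1, 4, 1]

centroids, assign = weighted_kmeans(node_vectors, hit_counts, k=2)
centers = center_nodes(node_vectors, centroids, assign)
print(assign, centers)
```

For the second option, the same functions could be applied to the 2-D grid coordinates of each item's best-matching node instead of the node weight vectors.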

Has QUIT--Anony-Mousse answered Oct 22 '22 00:10