Creating Clusters in MATLAB

Question

Suppose that I have generated some data in MATLAB as follows:

n = 100;

x = randi(n,[n,1]);
y = rand(n,1);
data = [x y];

plot(x,y,'rx')
axis([0 100 0 1])

Now I want to write an algorithm that groups all these points into clusters (the number of clusters is arbitrary), such that a point is a member of a cluster only if the distance between this point and at least one other member of that cluster is less than 10. How could I write the code?

saastn · Accepted Answer

The clustering method you are describing is DBSCAN. Note that this algorithm will find only one cluster in the provided data, since it is very unlikely that the dataset contains a point whose distance to every other point is more than 10. If this is really what you want, you can use the built-in dbscan function, or the DBSCAN implementation posted on the File Exchange (FE) if you are using a version older than R2019a.

% Generating random points, almost similar to the data provided by OP 
data = bsxfun(@times, rand(100, 2), [100 1]);
% Adding more random points
for i=1:5
    mu = rand(1, 2)*100 -50;
    A = rand(2)*5;
    sigma = A*A'+eye(2)*(1+rand*2);%[1,1.5;1.5,3];
    data = [data;mvnrnd(mu,sigma,20)];
end
% clustering using DBSCAN, with epsilon = 10 and min-points = 1
% (with the built-in function in R2019a or newer, call dbscan(data, 10, 1) instead)
idx = DBSCAN(data, 10, 1);
% plotting clusters
numCluster = max(idx);
colors = lines(numCluster);
scatter(data(:, 1), data(:, 2), 30, colors(idx, :), 'filled')
title(['No. of Clusters: ' num2str(numCluster)])
axis equal

[Figure: scatter plot of the clusters found by DBSCAN]

The numbers in the figure above show the distance between the closest pair of points in any two different clusters.
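If neither DBSCAN implementation is available, the rule from the question can also be sketched with base MATLAB only: connect every pair of points that are closer than the threshold and treat each connected component of that graph as a cluster. This is a minimal sketch, not the code used for the figure. It assumes R2016b or newer (for implicit expansion and `graph`/`conncomp`), and the names `threshold`, `D` and `A` are introduced here for illustration.

```matlab
rng(0);                                  % fixed seed, only so the run is repeatable
n = 100;
data = [randi(n, [n, 1]), rand(n, 1)];   % same kind of data as in the question

% Pairwise Euclidean distances between all points
dx = data(:, 1) - data(:, 1).';
dy = data(:, 2) - data(:, 2).';
D  = sqrt(dx.^2 + dy.^2);

% Two points are linked when their distance is below the threshold
threshold = 10;                          % assumed from the question
A = D < threshold;
A(1:n+1:end) = false;                    % clear the diagonal (no self-loops)

% Every connected component of the link graph is one cluster
idx = conncomp(graph(A)).';
numCluster = max(idx);

scatter(data(:, 1), data(:, 2), 30, idx, 'filled')
title(['No. of Clusters: ' num2str(numCluster)])
```

The full distance matrix needs memory proportional to n squared, so this is only reasonable for small datasets like the one in the question.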

juju89 · Answer

The MATLAB function clusterdata() (part of the Statistics and Machine Learning Toolbox) works well for what you're asking.

Here is how to apply it to your example:

% number of points
n = 100; 

% create the data
x = randi(n,[n,1]);
y = rand(n,1);
data = [x y]; 

% the number of clusters you want to create
num_clusters = 5; 

T1 = clusterdata(data,'Criterion','distance',...
'Distance','euclidean',...
'MaxClust', num_clusters); % semicolon suppresses printing all n labels

scatter(x, y, 100, T1,'filled')

In this case, I used 5 clusters and the Euclidean distance as the metric for grouping the data points, but you can always change that (see the documentation of clusterdata()).
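To get closer to the rule in the question (a fixed distance of 10 instead of a fixed number of clusters), one option is to pass a distance cutoff rather than 'MaxClust'. The following is a sketch under that assumption, using single linkage so that a point joins a cluster as soon as it is close enough to any one member. The variable name `T2` is introduced here for illustration.

```matlab
rng(0);                                  % fixed seed, only so the run is repeatable
n = 100;
data = [randi(n, [n, 1]), rand(n, 1)];

% Cut the single-linkage tree at distance 10 instead of asking for
% a fixed number of clusters
T2 = clusterdata(data, ...
    'Linkage',   'single', ...
    'Distance',  'euclidean', ...
    'Criterion', 'distance', ...
    'Cutoff',    10);

scatter(data(:, 1), data(:, 2), 100, T2, 'filled')
title(['No. of Clusters: ' num2str(max(T2))])
```

With data like this, expect very few clusters, because almost every point has some neighbour closer than 10.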

See the result below for 5 clusters with some random data.

[Figure: scatter plot of the 5 clusters]

Note that the two axes have very different scales (x-values range from 1 to 100 and y-values from 0 to 1), so the Euclidean distance is dominated by x and the clusters are essentially split along x. You could always normalize your data first.
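One possible way to normalize is to z-score each column before clustering, so that x and y contribute comparably to the distance. This is a sketch, assuming R2016b or newer for implicit expansion. The names `dataN` and `T3` are introduced here for illustration.

```matlab
rng(0);                                  % fixed seed, only so the run is repeatable
n = 100;
data = [randi(n, [n, 1]), rand(n, 1)];

% Z-score every column: subtract its mean, divide by its standard deviation
dataN = (data - mean(data, 1)) ./ std(data, 0, 1);

% Cluster the normalized points, but plot them in the original coordinates
T3 = clusterdata(dataN, 'Distance', 'euclidean', 'MaxClust', 5);
scatter(data(:, 1), data(:, 2), 100, T3, 'filled')
```

Keep in mind that after normalization a distance threshold such as 10 is no longer expressed in the original units.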

Tags:

matlab

cluster-analysis

data-analysis

MMd.NrC
