Here is a simple example of clustering data with three attributes (x, y, value): each sample is described by its location (x, y) and the variable attached to it.
My code is posted here:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

x = np.arange(100, 200, 1)
y = np.arange(100, 200, 1)
value = np.random.random(100*100)

xx, yy = np.meshgrid(x, y)
xx = xx.reshape(100*100)
yy = yy.reshape(100*100)
j = np.dstack((xx, yy, value))[0, :, :]   # samples as rows of (x, y, value)

fig = plt.figure(figsize=(12, 4))
ax1 = plt.subplot(121)
xi, yi = np.meshgrid(x, y)
va = value.reshape(100, 100)
pc = plt.pcolormesh(xi, yi, va, cmap=plt.cm.Spectral)
plt.colorbar(pc)

ax2 = plt.subplot(122)
y_pred = KMeans(n_clusters=12, random_state=0).fit_predict(j)
vb = y_pred.reshape(100, 100)
plt.pcolormesh(xi, yi, vb, cmap=plt.cm.Accent)
The resulting figure is presented here:
How can I identify the boundaries of each cluster zone and outline them to strengthen the visualization?
Here is an illustration I plotted manually; what I need is to identify the clustering boundaries and draw them as lines.
I found an interesting question here that tries to draw the boundaries of cluster areas in R.
I then tried the following subroutine:
for i in range(12):
    # outline cluster i: contour its boolean mask at level 0.5
    plt.contour(xi, yi, vb == i, levels=[0.5], colors='b')
It's done!
We can also plot the cluster centers as determined by the k-means estimator:
plt.scatter(X[:, 0], X[:, 1], c=y_kmeans, s=50, cmap='viridis')
centers = kmeans.cluster_centers_
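Adapted to the arrays used in the question above (a minimal sketch, assuming j, xi, yi and vb from the earlier code; the names km and centers are my own), the same idea looks like this:
km = KMeans(n_clusters=12, random_state=0).fit(j)
centers = km.cluster_centers_              # shape (12, 3): mean x, y, value of each cluster

plt.pcolormesh(xi, yi, vb, cmap=plt.cm.Accent)
plt.scatter(centers[:, 0], centers[:, 1], c='black', s=50)   # mark each (x, y) center
plt.show()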
Single-Line Decision Boundary: the basic strategy for drawing a decision boundary on a scatter plot is to find a single line that separates the data points into regions signifying different classes.
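As a quick, hedged illustration of that strategy (the two synthetic classes and the use of LogisticRegression are my own choices, not from the post), a linear classifier gives exactly such a single line, drawn where its decision function equals zero:
from sklearn.linear_model import LogisticRegression

# two synthetic 2-D classes, for illustration only
rng = np.random.RandomState(0)
A = rng.normal(loc=[0, 0], scale=1.0, size=(50, 2))
B = rng.normal(loc=[4, 4], scale=1.0, size=(50, 2))
X2 = np.vstack([A, B])
labels = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression().fit(X2, labels)
w, b = clf.coef_[0], clf.intercept_[0]

# the decision boundary is the line w[0]*x + w[1]*y + b = 0
xs = np.linspace(X2[:, 0].min(), X2[:, 0].max(), 100)
ys = -(w[0] * xs + b) / w[1]

plt.scatter(X2[:, 0], X2[:, 1], c=labels, cmap='viridis', s=30)
plt.plot(xs, ys, 'k--')
plt.show()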
The cluster zones are actually just a Voronoi diagram of the cluster centers. SciPy has some tools for computing Voronoi cells given a set of points. This page has some examples of how you can do this.
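A minimal sketch of that suggestion, assuming the array j from the question and keeping only the (x, y) coordinates of the centers (the variable names and the use of voronoi_plot_2d are my additions):
from scipy.spatial import Voronoi, voronoi_plot_2d

km = KMeans(n_clusters=12, random_state=0).fit(j)   # refit to expose cluster_centers_
vor = Voronoi(km.cluster_centers_[:, :2])           # Voronoi cells of the (x, y) centers
voronoi_plot_2d(vor)                                # draws the cell edges and vertices
plt.show()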