How to identify cluster labels in k-means (scikit-learn)

I am learning scikit-learn in Python. The example linked below prints the most frequent words in each cluster, not a name for the cluster.

http://scikit-learn.org/stable/auto_examples/document_clustering.html

I found that the km object has "km.labels_", which lists the id (a number) of the centroid each sample is assigned to.

I have two questions:

1. How do I generate the cluster labels?
2. How do I identify the members of each cluster for further processing?

I have a working knowledge of k-means and am aware of tf-idf concepts.

asked Feb 05 '15 13:02

vij555

1 Answer

How do I generate the cluster labels?

I'm not sure what you mean by this. You have no cluster labels other than cluster 1, cluster 2, ..., cluster n. That is why it's called unsupervised learning, because there are no labels.

Do you mean you actually have labels and you want to see if the clustering algorithm happened to cluster the data according to your labels?

In that case, the documentation you linked to provides an example:

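from sklearn import metrics

# labels: the known ground-truth classes; km.labels_: the cluster index assigned to each sample
print("Homogeneity: %0.3f" % metrics.homogeneity_score(labels, km.labels_))
print("Completeness: %0.3f" % metrics.completeness_score(labels, km.labels_))
print("V-measure: %0.3f" % metrics.v_measure_score(labels, km.labels_))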

How to identify the members of the clusters for further processing.

See the documentation for KMeans. In particular, the predict method:

predict(X)

Parameters:
X : {array-like, sparse matrix}, shape = [n_samples, n_features]. New data to predict.

Returns:
labels : array, shape [n_samples,]. Index of the cluster each sample belongs to.

If you are not predicting anything new, km.labels_ already holds the cluster index of each sample in the training data.
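
As a rough illustration of both options, the sketch below groups the training rows by their entry in km.labels_ and then calls predict on unseen points. The toy 2-D data and the members dictionary are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: two well-separated groups of 2-D points.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# km.labels_[i] is the cluster index assigned to training row X[i],
# so the members of cluster c are the rows where labels_ == c.
members = {c: np.where(km.labels_ == c)[0] for c in range(km.n_clusters)}

for c, idx in members.items():
    print("cluster %d: rows %s" % (c, idx.tolist()))

# For samples not seen during fit, predict() returns the index
# of the nearest centroid.
new_points = np.array([[0.05, 0.05], [4.9, 5.0]])
new_labels = km.predict(new_points)
print("new points assigned to clusters:", new_labels.tolist())
```

The row indices in members can be used to slice the original documents or feature matrix (for example X[members[0]]) for further processing.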

answered Sep 21 '22 18:09

IVlad

Tags: python, machine-learning, cluster-analysis, k-means, scikit-learn
