Affinity Propagation (sklearn) - strange behavior

Tags:

cluster-analysis

scikit-learn

I am trying to use affinity propagation for a simple clustering task:

from sklearn.cluster import AffinityPropagation

# Eight identical one-dimensional samples
c = [[0], [0], [0], [0], [0], [0], [0], [0]]
af = AffinityPropagation(affinity='euclidean').fit(c)
print(af.labels_)

I get this strange result: [0 1 0 1 2 1 1 0]

I would expect to have all samples in the same cluster, like in this case:

c = [[0], [0], [0]]
af = AffinityPropagation(affinity='euclidean').fit(c)
print(af.labels_)

which indeed puts all samples in the same cluster: [0 0 0]

What am I missing?

Thanks


asked Jun 14 '15 13:06

1 Answer

I believe this is because your problem is essentially ill-posed: you are passing many copies of the same point to an algorithm that works by comparing similarities between different points. AffinityPropagation does matrix math on the pairwise similarities under the hood, and your similarity matrix (which is all zeros) is badly degenerate, since no sample is a better exemplar than any other. To cope with this, the implementation adds a small random matrix to the similarity matrix, so that the algorithm can still run when it encounters two copies of the same point. The labels assigned to identical points are therefore essentially arbitrary.
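To make the degeneracy concrete, here is a minimal sketch in plain Python (no scikit-learn needed). It assumes the 'euclidean' affinity means the negative squared Euclidean distance between samples. The helper names similarity_matrix and add_noise, the noise scale and the seed are all made up for illustration and are not scikit-learn's actual code.

```python
import random


def similarity_matrix(points):
    """Negative squared Euclidean distance between every pair of points."""
    return [
        [-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]


def add_noise(matrix, scale=1e-12, seed=0):
    """Add a tiny random value to every entry.

    The scale and seed are illustrative assumptions, not scikit-learn's values.
    """
    rng = random.Random(seed)
    return [[value + scale * rng.random() for value in row] for row in matrix]


c = [[0], [0], [0], [0], [0], [0], [0], [0]]

S = similarity_matrix(c)
# Every pairwise similarity is the same, so nothing distinguishes one sample from another.
distinct_before = {value for row in S for value in row}

noisy = add_noise(S)
# After the perturbation the ties are broken, but only by random noise.
distinct_after = {value for row in noisy for value in row}

print(len(distinct_before))     # number of distinct similarity values before noise
print(len(distinct_after) > 1)  # whether the noise produced differing values
```

Once the ties are broken only by noise, whichever samples end up as exemplars is down to chance rather than anything in the data, which would be consistent with the odd labels in the question.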


answered Sep 23 '22 07:09

Andreus

Baba