I've had encouraging results clustering a set of entity names with scikit-learn's affinity propagation implementation, using a modified Jaro-Winkler distance as the similarity metric, but my clusters are still too numerous (i.e., too many false positives).
I see in the scikit-learn documentation that there exists a 'preference' parameter that affects the number of clusters, with the following description:
preference : array-like, shape (n_samples,) or float, optional
Preferences for each point - points with larger values of preferences are more likely to be chosen as exemplars. The number of exemplars, ie of clusters, is influenced by the input preferences value. If the preferences are not passed as arguments, they will be set to the median of the input similarities.[0]
However, when I began tinkering with this value, I found that a very narrow range of values was giving me either too many clusters (preference=-11.13) or too few clusters (preference=-11.11).
Is there some way to determine what a 'reasonable' value of the preference parameter should be? And why would it be that I'm unable to obtain a non-extreme number of clusters?
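For reference, the cliff between "too many" and "too few" clusters can be reproduced with a small sweep over a precomputed similarity matrix. This sketch uses a plain negative Levenshtein distance as a hypothetical stand-in for the modified Jaro-Winkler similarity (the names and the sweep range are made up; any square similarity matrix works the same way):

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

def lev(a, b):
    """Levenshtein edit distance via the standard DP recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

# Toy entity names (hypothetical); similarity = negative edit distance.
names = ["jonathan smith", "jonathon smith", "j. smith",
         "maria garcia", "maria garzia", "bob jones"]
S = np.array([[-lev(a, b) for b in names] for a in names], dtype=float)

# Sweep the preference from the minimum similarity up to the median
# (the median is scikit-learn's default) and report the cluster count.
for pref in np.linspace(S.min(), np.median(S), 6):
    ap = AffinityPropagation(affinity="precomputed",
                             preference=pref, random_state=0).fit(S)
    n = len(ap.cluster_centers_indices_)
    print(f"preference={pref:7.2f} -> {n} clusters")
```

Plotting cluster count against preference this way makes it easy to see whether the transition is genuinely a cliff for your data or whether intermediate counts exist at finer granularity.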
Similar questions:
Affinity Propagation - Cluster Imbalance
Affinity Propagation preferences initialization
You could try using sklearn.model_selection.GridSearchCV or sklearn.model_selection.RandomizedSearchCV.
You could define a custom error measure that steers the hyper-parameter search toward fewer clusters, then search over several preference values and pick the one that performs best on a validation set.
More info: http://scikit-learn.org/stable/modules/grid_search.html
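GridSearchCV's scoring hooks assume a supervised signal, so for a purely unsupervised criterion it is often simpler to drive the same idea with sklearn.model_selection.ParameterGrid and a hand-written error measure. A minimal sketch on toy Euclidean data (the target cluster count and the grid ranges are placeholders you would tune for your own similarity matrix):

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.model_selection import ParameterGrid

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))  # toy data; substitute your own features

target = 3  # hypothetical "reasonable" number of clusters
grid = ParameterGrid({"preference": np.linspace(-50.0, -1.0, 8),
                      "damping": [0.5, 0.7, 0.9]})

best = None
for params in grid:
    ap = AffinityPropagation(random_state=0, **params).fit(X)
    n = len(ap.cluster_centers_indices_)
    # Custom error: distance from the cluster count we consider reasonable.
    err = abs(n - target)
    if best is None or err < best[0]:
        best = (err, params, n)

print("best error:", best[0], "params:", best[1], "clusters:", best[2])
```

The same loop works with affinity="precomputed" if you pass your similarity matrix instead of X; swapping the error measure (e.g., for a silhouette-style score) only changes the one line computing err.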