How to implement callable distance metric in scikit-learn?

Question

I'm using the clustering module in Python's scikit-learn, and I'd like to use a normalized Euclidean distance. There is no built-in distance for this (that I know of). Here's a list.

So I want to implement my own normalized Euclidean distance using a callable. The function is part of my distance module and is called distance.normalized_euclidean. It takes three inputs: X, Y, and SD.

However, normalized Euclidean distance requires the standard deviation of the population sample, and the pairwise distance in scipy only allows two inputs: X and Y.

How do I allow it to take an additional argument?

I tried passing it as an extra keyword argument, but that didn't work:

cluster = DBSCAN(eps=1.0, min_samples=1, metric=distance.normalized_euclidean, SD=stdv)

where distance.normalized_euclidean is the function I wrote that takes two arrays, X and Y, and computes the normalized Euclidean distance between them.

...but this throws an error:

TypeError: __init__() got an unexpected keyword argument 'SD'

What is the right way to pass additional keyword arguments?

Here it says "Any further parameters are passed directly to the distance function", which made me think that this would be acceptable.

yangjie · Accepted Answer

You can use a lambda function as the metric. It takes the two input arrays and passes stdv through to your function:

cluster = DBSCAN(eps=1.0, min_samples=1, metric=lambda X, Y: distance.normalized_euclidean(X, Y, SD=stdv))
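
For a fuller picture, here is a minimal sketch of the same idea with made-up data. The body of normalized_euclidean below is only a guess at what the asker's distance.normalized_euclidean does (per-feature differences divided by SD), and the toy array X and the choice of algorithm="brute" are assumptions for illustration. It binds SD with functools.partial instead of a lambda, then shows two alternatives that should give the same clustering: the metric_params argument that recent scikit-learn versions of DBSCAN accept, and scaling the features once before using the built-in Euclidean metric.

```python
from functools import partial

import numpy as np
from sklearn.cluster import DBSCAN


def normalized_euclidean(x, y, SD):
    """Hypothetical stand-in for distance.normalized_euclidean:
    Euclidean distance with each feature difference divided by SD."""
    return np.sqrt(np.sum(((x - y) / SD) ** 2))


# Toy data (assumed): two tight groups, features on very different scales.
X = np.array([
    [1.0, 100.0],
    [1.1, 110.0],
    [5.0, 900.0],
    [5.1, 910.0],
])
stdv = X.std(axis=0)  # per-feature standard deviation

# Option 1: partial() fixes SD, leaving a callable of two arrays.
# algorithm="brute" computes pairwise distances directly, which is fine
# for small data.
metric = partial(normalized_euclidean, SD=stdv)
labels = DBSCAN(eps=1.0, min_samples=1, metric=metric,
                algorithm="brute").fit_predict(X)

# Option 2: let DBSCAN forward the extra keyword via metric_params.
labels_params = DBSCAN(eps=1.0, min_samples=1, metric=normalized_euclidean,
                       metric_params={"SD": stdv},
                       algorithm="brute").fit_predict(X)

# Option 3: divide the features by SD once, then use the built-in metric.
labels_scaled = DBSCAN(eps=1.0, min_samples=1,
                       metric="euclidean").fit_predict(X / stdv)

# All three group the first two rows together and the last two together.
print(labels, labels_params, labels_scaled)
```

If the distance really is just a per-feature rescaling, option 3 is likely the fastest, since a Python callable is invoked once per pair of points while the built-in metric runs in compiled code.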

Tags: python, keyword-argument, euclidean-distance, scipy, scikit-learn

makansij
