
Use pdist() in Python with a custom distance function

I have been using scipy.spatial.distance.pdist(...) in Python, and it has proven useful and fast for some of the applications I have been working on.

I need to use a custom pairwise distance function rather than one of the standard metrics selected through the metric argument. As a simple example, suppose I do not want to use the built-in Euclidean distance like this:

 Y = pdist(X, 'euclidean')

Instead, I want to define the Euclidean distance function myself and pass it as an argument to pdist(). How can I pass my own implementation of the Euclidean distance to pdist() and get exactly the same results? The answer will show me how to plug in the custom function I am actually interested in.

I know how to do this with pdist() in MATLAB, but not yet in Python. Thanks for your suggestions.

Yas asked Apr 05 '16 09:04



1 Answer

There is an example in the documentation for pdist:

import numpy as np
from scipy.spatial.distance import pdist

# X is an m-by-n array holding m observations, one per row
dm = pdist(X, lambda u, v: np.sqrt(((u-v)**2).sum()))

If you want to use a regular function instead of a lambda function, the equivalent would be:

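import numpy as np
from scipy.spatial.distance import pdist

def dfun(u, v):
    # u and v are two rows of X, passed in as 1-D arrays
    return np.sqrt(((u-v)**2).sum())

# X is an m-by-n array holding m observations, one per row
dm = pdist(X, dfun)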
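
Since the question asks for exactly the same results as the built-in metric, here is a minimal sketch of how one might check that. The array X and the name euclidean_custom are made up for illustration; they are not from the question or the documentation.

```python
import numpy as np
from scipy.spatial.distance import pdist

def euclidean_custom(u, v):
    # u and v are two rows of X, passed in as 1-D arrays
    return np.sqrt(((u - v) ** 2).sum())

# Hypothetical sample data: 5 observations with 3 features each
rng = np.random.default_rng(0)
X = rng.random((5, 3))

d_builtin = pdist(X, 'euclidean')
d_custom = pdist(X, euclidean_custom)

# pdist returns a condensed vector with m*(m-1)/2 entries (10 for m = 5)
print(d_custom.shape)

# Compare with a floating-point tolerance instead of exact equality
print(np.allclose(d_builtin, d_custom))
```

Note that a Python callable is invoked once per pair of rows, so it will generally be slower than a built-in string metric on large inputs.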
Pelle Nilsson answered Nov 20 '22 10:11
