Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimize performance for calculation of euclidean distance between two images

I implemented the k-nearest-neighbours algorithm in python to classify some randomly picked images from the mnist database. However I found my distance function to be quite slow: An analisys of 10 test images against the training set of 10k images takes about 2mins. The images have a resolution of 28x28 pixels. Since I'm new to python I got the feeling this could be faster. The function is supposed to calculate the euclidean distance between two same-sized grayscale images.

def calculateDistance(image1, image2):
    distance = 0
    for i in range(len(image1)):
        for j in range(len(image1)):
            distance += math.pow((image1[i][j]-image2[i][j]),2)
    distance = numpy.sqrt(distance)
    return distance
like image 677
wodzu Avatar asked Jan 08 '23 08:01

wodzu


1 Answers

If you're using numpy arrays to represent the images, you could use the following instead:

def calculateDistance(i1, i2):
    return numpy.sum((i1-i2)**2)

This should be much faster because it uses a fast C implementation for the heavy lifting. Also consider using caching to not compute the difference of two images twice.

like image 129
Niklas B. Avatar answered Jan 09 '23 23:01

Niklas B.