I implemented the k-nearest-neighbours algorithm in Python to classify some randomly picked images from the MNIST database. However, I found my distance function to be quite slow: an analysis of 10 test images against the training set of 10k images takes about 2 minutes. The images have a resolution of 28x28 pixels. Since I'm new to Python, I have the feeling this could be faster. The function is supposed to calculate the Euclidean distance between two same-sized grayscale images.
    import math
    import numpy

    def calculateDistance(image1, image2):
        distance = 0
        for i in range(len(image1)):
            for j in range(len(image1)):
                distance += math.pow((image1[i][j]-image2[i][j]),2)
        distance = numpy.sqrt(distance)
        return distance
If you're using numpy arrays to represent the images, you could use the following instead:
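    import numpy
    def calculateDistance(i1, i2):
        # Sum of squared pixel differences, then the square root for the Euclidean distance.
        return numpy.sqrt(numpy.sum((i1-i2)**2))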
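This should be much faster because NumPy performs the element-wise subtraction, squaring, and summation in compiled C code instead of looping in the Python interpreter. Also consider caching results so that the distance between the same pair of images is not computed twice.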
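To illustrate the vectorised approach a bit further, here is a minimal sketch that also computes the distances from one test image to a whole stack of training images in a single call. The function names `euclidean_distance` and `distances_to_all`, the `(n, 28, 28)` array layout, and the random stand-in data are assumptions made for this example, not part of the original code. The cast to float is there because, if the pixels are stored as `uint8`, subtracting them directly would wrap around instead of going negative.

```python
import numpy as np

def euclidean_distance(image1, image2):
    """Euclidean distance between two same-shaped images."""
    # Cast to float first: uint8 pixel arrays would wrap around on subtraction.
    diff = np.asarray(image1, dtype=np.float64) - np.asarray(image2, dtype=np.float64)
    return np.sqrt(np.sum(diff ** 2))

def distances_to_all(test_image, training_images):
    """Distances from one image to every image in an (n, height, width) stack."""
    train = np.asarray(training_images, dtype=np.float64)
    # Broadcasting subtracts the test image from every training image.
    diff = train - np.asarray(test_image, dtype=np.float64)
    # Sum over the two pixel axes, leaving one value per training image.
    return np.sqrt(np.sum(diff ** 2, axis=(1, 2)))

# Random stand-in data (hypothetical, not real MNIST images).
rng = np.random.default_rng(0)
training = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
test = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

dists = distances_to_all(test, training)
nearest = np.argsort(dists)[:5]  # indices of the 5 closest training images
```

If only the ranking of neighbours matters, the square root could probably be dropped, since it does not change the order of the distances.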