
What is the fastest way to calculate sum of absolute differences between two images in Python?

I am trying to compare images in a Python 3 application that uses Pillow and, optionally, NumPy. For compatibility reasons, I don't intend to use any other external packages that are not pure Python. I found this Pillow-based algorithm on Rosetta Code, and it may serve my purpose, but it takes some time to run:

from PIL import Image

def compare_images(img1, img2):
    """Compute percentage of difference between 2 JPEG images of same size
    (using the sum of absolute differences). Alternatively, compare two bitmaps
    as defined in basic bitmap storage. Useful for comparing two JPEG images
    saved with different compression ratios.

    Adapted from:
    http://rosettacode.org/wiki/Percentage_difference_between_images#Python

    :param img1: an Image object
    :param img2: an Image object
    :return: A float with the percentage of difference, or None if images are
    not directly comparable.
    """

    # Don't compare if images are of different modes or different sizes.
    if (img1.mode != img2.mode) \
            or (img1.size != img2.size) \
            or (img1.getbands() != img2.getbands()):
        return None

    pairs = zip(img1.getdata(), img2.getdata())
    if len(img1.getbands()) == 1:
        # for gray-scale jpegs
        dif = sum(abs(p1 - p2) for p1, p2 in pairs)
    else:
        dif = sum(abs(c1 - c2) for p1, p2 in pairs for c1, c2 in zip(p1, p2))

    # Count one component per band, so gray-scale images are not divided by 3
    ncomponents = img1.size[0] * img1.size[1] * len(img1.getbands())
    return (dif / 255.0 * 100) / ncomponents  # Difference (percentage)

Trying to find alternatives, I discovered that this function could be rewritten using Numpy:

import numpy as np    
from PIL import Image

def compare_images_np(img1, img2):
    if (img1.mode != img2.mode) \
            or (img1.size != img2.size) \
            or (img1.getbands() != img2.getbands()):
        return None

    dif = 0
    for band_index, band in enumerate(img1.getbands()):
        m1 = np.array([p[band_index] for p in img1.getdata()]).reshape(*img1.size)
        m2 = np.array([p[band_index] for p in img2.getdata()]).reshape(*img2.size)
        dif += np.sum(np.abs(m1-m2))

    # Count one component per band, matching the bands summed in the loop above
    ncomponents = img1.size[0] * img1.size[1] * len(img1.getbands())
    return (dif / 255.0 * 100) / ncomponents  # Difference (percentage)

I was expecting an improvement in processing speed, but it actually takes a little longer. I have no experience with NumPy beyond the basics, so I wonder if there is any way to make it faster, for instance with an approach that does not need that for loop. Any ideas?

Victor Domingos Avatar asked Jan 18 '26 03:01



1 Answer

I think I understand what you are trying to do. I have no idea of the relative performance of our two machines so maybe you can benchmark it yourself.

from PIL import Image
import numpy as np

# Load images, convert to RGB, then to numpy arrays and ravel into long, flat things
a=np.array(Image.open('a.png').convert('RGB')).ravel()
b=np.array(Image.open('b.png').convert('RGB')).ravel()

# Calculate the sum of the absolute differences divided by number of elements
MAE = np.sum(np.abs(np.subtract(a,b,dtype=np.float64))) / a.shape[0]

The only "tricky" thing in there is forcing the result type of np.subtract() to a float, which ensures I can store negative numbers. The image arrays are unsigned 8-bit, so a plain subtraction would wrap around instead of going negative. It may be worth trying with dtype=np.int16 on your hardware to see if that is faster.
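To see why the dtype matters, here is a tiny sketch with made-up pixel values (the arrays `a` and `b` below are illustrative only, not taken from real images). It compares a plain uint8 subtraction with one forced to int16:

```python
import numpy as np

# Three made-up 8-bit pixel components per "image"
a = np.array([10, 200, 0], dtype=np.uint8)
b = np.array([20, 100, 255], dtype=np.uint8)

# Plain subtraction stays in uint8 and wraps around modulo 256
wrapped = a - b

# Forcing a signed result type keeps the negative differences
signed = np.subtract(a, b, dtype=np.int16)

# Sum of absolute differences computed from the signed values
sad = int(np.abs(signed).sum())

print(wrapped.tolist())  # [246, 100, 1]
print(signed.tolist())   # [-10, 100, -255]
print(sad)               # 365
```

int16 is wide enough here because the difference of two 8-bit values always lies between -255 and 255.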


A fast way to benchmark it is as follows. Start ipython and then type in the following:

from PIL import Image
import numpy as np

a=np.array(Image.open('a.png').convert('RGB')).ravel()
b=np.array(Image.open('b.png').convert('RGB')).ravel()

Now you can time my code with:

%timeit np.sum(np.abs(np.subtract(a,b,dtype=np.float64))) / a.shape[0]
6.72 µs ± 21.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Or, you can try an int16 version like this:

%timeit np.sum(np.abs(np.subtract(a,b,dtype=np.int16))) / a.shape[0]
6.43 µs ± 30.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

If you want to time your code, paste in your function then use:

%timeit compare_images(img1, img2)
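If you want a drop-in replacement that takes Image objects and returns a percentage like the function in the question, something along these lines might work. This is only a sketch: the name `compare_images_fast` is made up, and it assumes both images use an 8-bit-per-band mode such as "L" or "RGB".

```python
import numpy as np
from PIL import Image


def compare_images_fast(img1, img2):
    """Percentage difference between two images, without Python-level loops.

    Hypothetical helper; assumes 8-bit-per-band modes such as "L" or "RGB".
    Returns None if the images are not directly comparable.
    """
    if img1.mode != img2.mode or img1.size != img2.size:
        return None

    # Convert to signed 16-bit so the differences can go negative
    a = np.asarray(img1).astype(np.int16)
    b = np.asarray(img2).astype(np.int16)

    # Accumulate in 64-bit so large images cannot overflow the sum
    total = int(np.abs(a - b).sum(dtype=np.int64))

    # a.size counts every component: width * height * number of bands
    return total / (255.0 * a.size) * 100
```

Because a.size already includes the number of bands, the same code handles gray-scale and RGB images. You can time it in ipython with %timeit compare_images_fast(img1, img2), where img1 and img2 are two images you have opened with Image.open().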
Mark Setchell Avatar answered Jan 20 '26 19:01
