
What is the fastest way to generate image thumbnails in Python?

I'm building a photo gallery in Python and want to be able to quickly generate thumbnails for the high resolution images.

What's the fastest way to generate high quality thumbnails for a variety of image sources?

Should I be using an external library like imagemagick, or is there an efficient internal way to do this?

The dimensions of the resized images will be (max size):

120x120, 720x720, 1600x1600

Quality is an issue, as I want to preserve as many of the original colors as possible and minimize compression artifacts.

Thanks.

ensnare asked Dec 25 '11 19:12



2 Answers

I fancied some fun so I did some benchmarking on the various methods suggested above and a few ideas of my own.

I collected 1000 high resolution 12MP iPhone 6S images, each 4032x3024 pixels, and used an 8-core iMac.

Here are the techniques and results - each in its own section.
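
For orientation, the timings reported in the sections below can be put side by side. This is a small sketch that computes each method's speed-up relative to the slowest one (Method 1); the numbers are copied from the results below and nothing else is measured.

```python
# Wall-clock seconds for the 1000-image run, as reported in each section
timings = {
    "1 Sequential ImageMagick": 170,
    "2 Sequential ImageMagick, single load": 76,
    "3 GNU Parallel + ImageMagick": 18,
    "4 GNU Parallel + vips": 8,
    "5 Sequential PIL": 38,
    "6 Sequential PIL, single load": 27,
    "7 Parallel PIL": 6,
    "8 Parallel OpenCV": 5,
}

# Use the slowest method as the baseline
baseline = timings["1 Sequential ImageMagick"]
speedups = {name: baseline / seconds for name, seconds in timings.items()}

# Print fastest first
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print("%-40s %4d s  %5.1fx" % (name, seconds, speedups[name]))
```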


Method 1 - Sequential ImageMagick

This is simplistic, unoptimised code. Each image is read and a thumbnail is produced. Then it is read again and a different sized thumbnail is produced.

    #!/bin/bash

    start=$SECONDS
    # Loop over all files
    for f in image*.jpg; do
       # Loop over all sizes
       for s in 1600 720 120; do
          echo Reducing $f to ${s}x${s}
          convert "$f" -resize ${s}x${s} t-$f-$s.jpg
       done
    done
    echo Time: $((SECONDS-start))

Result: 170 seconds


Method 2 - Sequential ImageMagick with single load and successive resizing

This is still sequential but slightly smarter. Each image is read only once, and the loaded image is then resized down three times and saved at three resolutions, rather than being read three times.

    #!/bin/bash

    start=$SECONDS
    # Loop over all files
    N=1
    for f in image*.jpg; do
       echo Resizing $f
       # Load once and successively scale down
       convert "$f"                              \
          -resize 1600x1600 -write t-$N-1600.jpg \
          -resize 720x720   -write t-$N-720.jpg  \
          -resize 120x120          t-$N-120.jpg
       ((N=N+1))
    done
    echo Time: $((SECONDS-start))

Result: 76 seconds


Method 3 - GNU Parallel + ImageMagick

This builds on the previous method, by using GNU Parallel to process N images in parallel, where N is the number of CPU cores on your machine.

    #!/bin/bash

    start=$SECONDS

    doit() {
       file=$1
       index=$2
       convert "$file"                               \
          -resize 1600x1600 -write t-$index-1600.jpg \
          -resize 720x720   -write t-$index-720.jpg  \
          -resize 120x120          t-$index-120.jpg
    }

    # Export doit() to subshells for GNU Parallel
    export -f doit

    # Use GNU Parallel to do them all in parallel
    parallel doit {} {#} ::: *.jpg

    echo Time: $((SECONDS-start))

Result: 18 seconds


Method 4 - GNU Parallel + vips

This is the same as the previous method, but it uses vips at the command-line instead of ImageMagick.

    #!/bin/bash

    start=$SECONDS

    doit() {
       file=$1
       index=$2
       r0=t-$index-1600.jpg
       r1=t-$index-720.jpg
       r2=t-$index-120.jpg
       vipsthumbnail "$file"  -s 1600 -o "$r0"
       vipsthumbnail "$r0"    -s 720  -o "$r1"
       vipsthumbnail "$r1"    -s 120  -o "$r2"
    }

    # Export doit() to subshells for GNU Parallel
    export -f doit

    # Use GNU Parallel to do them all in parallel
    parallel doit {} {#} ::: *.jpg

    echo Time: $((SECONDS-start))

Result: 8 seconds


Method 5 - Sequential PIL

This is intended to correspond to Jakob's answer.

    #!/usr/local/bin/python3

    import glob
    from PIL import Image

    sizes = [(120,120), (720,720), (1600,1600)]
    files = glob.glob('image*.jpg')

    N=0
    for image in files:
        for size in sizes:
            # Re-open the source image for every size
            im=Image.open(image)
            im.thumbnail(size)
            im.save("t-%d-%s.jpg" % (N,size[0]))
        N=N+1

Result: 38 seconds


Method 6 - Sequential PIL with single load & successive resize

This is intended as an improvement to Jakob's answer, wherein the image is loaded just once and then resized down three times instead of re-loading each time to produce each new resolution.

    #!/usr/local/bin/python3

    import glob
    from PIL import Image

    sizes = [(120,120), (720,720), (1600,1600)]
    files = glob.glob('image*.jpg')

    N=0
    for image in files:
        # Load just once, then successively scale down
        im=Image.open(image)
        im.thumbnail((1600,1600))
        im.save("t-%d-1600.jpg" % (N))
        im.thumbnail((720,720))
        im.save("t-%d-720.jpg"  % (N))
        im.thumbnail((120,120))
        im.save("t-%d-120.jpg"  % (N))
        N=N+1

Result: 27 seconds


Method 7 - Parallel PIL

This is intended to correspond to Audionautics' answer, insofar as it uses Python's multiprocessing. It also obviates the need to re-load the image for each thumbnail size.

    #!/usr/local/bin/python3

    import glob
    from PIL import Image
    from multiprocessing import Pool

    def thumbnail(params):
        filename, N = params
        try:
            # Load just once, then successively scale down
            im=Image.open(filename)
            im.thumbnail((1600,1600))
            im.save("t-%d-1600.jpg" % (N))
            im.thumbnail((720,720))
            im.save("t-%d-720.jpg"  % (N))
            im.thumbnail((120,120))
            im.save("t-%d-120.jpg"  % (N))
            return 'OK'
        except Exception as e:
            # Return the exception instead of raising so one bad file
            # does not abort the whole pool.map() call
            return e

    # The guard is required on platforms where multiprocessing starts
    # its workers with "spawn" rather than "fork"
    if __name__ == '__main__':
        files = glob.glob('image*.jpg')
        pool = Pool(8)
        results = pool.map(thumbnail, zip(files,range((len(files)))))

Result: 6 seconds
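
In the code above, `results` ends up holding either 'OK' or the caught exception for each file, but it is never inspected. The following is a minimal sketch of one way to report failures afterwards. `summarise` is a hypothetical helper, and the sample data is made up to stand in for the real `files` and `results`.

```python
def summarise(files, results):
    """Pair each input file with its worker result; return (ok_count, failures)."""
    failed = [(f, r) for f, r in zip(files, results) if r != 'OK']
    return len(results) - len(failed), failed


# Made-up sample data standing in for glob.glob(...) and pool.map(...)
sample_files = ['image1.jpg', 'image2.jpg', 'image3.jpg']
sample_results = ['OK', OSError('cannot identify image file'), 'OK']

ok_count, failed = summarise(sample_files, sample_results)
print("%d succeeded, %d failed" % (ok_count, len(failed)))
for name, error in failed:
    print("  %s: %r" % (name, error))
```

In the real script this would be called as `summarise(files, results)` after `pool.map` returns.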


Method 8 - Parallel OpenCV

This is intended to be an improvement on bcattle's answer, insofar as it uses OpenCV but it also obviates the need to re-load the image to generate each new resolution output.

    #!/usr/local/bin/python3

    import cv2
    import glob
    from multiprocessing import Pool

    def thumbnail(params):
        filename, N = params
        try:
            # Load just once, then successively scale down.
            # Note: cv2.resize() scales to exactly the given (width, height),
            # so these outputs are square and do not keep the aspect ratio.
            im = cv2.imread(filename)
            im = cv2.resize(im, (1600,1600))
            cv2.imwrite("t-%d-1600.jpg" % N, im)
            im = cv2.resize(im, (720,720))
            cv2.imwrite("t-%d-720.jpg" % N, im)
            im = cv2.resize(im, (120,120))
            cv2.imwrite("t-%d-120.jpg" % N, im)
            return 'OK'
        except Exception as e:
            return e

    # The guard is required on platforms where multiprocessing starts
    # its workers with "spawn" rather than "fork"
    if __name__ == '__main__':
        files = glob.glob('image*.jpg')
        pool = Pool(8)
        results = pool.map(thumbnail, zip(files,range((len(files)))))

Result: 5 seconds
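
One caveat on Method 8: `cv2.resize` scales to exactly the size it is given, so the code as written squashes each 4032x3024 image into a square, whereas PIL's `thumbnail()` keeps the aspect ratio. The following is a minimal sketch of computing aspect-preserving target dimensions first. `fit_within` is a hypothetical helper for illustration, not part of OpenCV, and this variant was not part of the benchmark.

```python
def fit_within(width, height, max_side):
    """Return (new_width, new_height) scaled to fit a max_side x max_side box.

    Hypothetical helper: keeps the aspect ratio and never enlarges an image
    that is already small enough.
    """
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


# Target sizes for the 4032x3024 test images used in this benchmark
for side in (1600, 720, 120):
    print(side, fit_within(4032, 3024, side))
```

The result could then be passed as the size argument, for example `cv2.resize(im, fit_within(w, h, 1600), interpolation=cv2.INTER_AREA)` with `h, w = im.shape[:2]`.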

Mark Setchell answered Sep 25 '22 14:09


You want PIL; it does this with ease.

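    from PIL import Image

    sizes = [(120,120), (720,720), (1600,1600)]
    files = ['a.jpg','b.jpg','c.jpg']

    for image in files:
        for size in sizes:
            im = Image.open(image)
            im.thumbnail(size)
            # size holds ints, so convert them to str before joining; the size
            # goes before the original name so the output keeps its .jpg
            # extension, which PIL uses to pick the output format
            im.save("thumbnail_%s_%s" % ("_".join(map(str, size)), image))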

If you desperately need speed, then thread it, multiprocess it, or use another language.

Jakob Bowyer answered Sep 23 '22 14:09