I made a small Python 3.x app for myself that resizes all the images from a folder by a certain given percentage.
The app supports multicore CPUs: it splits the work across as many threads as the CPU has cores.
The bottleneck here is the CPU, as my RAM stays 40% free and my HDD usage is 3% during runtime, but all CPU cores are near 100%.
Is there a way to process the images on the GPU? I think it would greatly improve performance, as GPUs have more than 4 cores.
Here is a bit of code on how the processing is done:
def worker1(file_list, percentage, thread_no):
    """thread class"""
    global counter
    save_dir = askdir_entry.get() + '/ResizeImage/'
    for picture in file_list:
        image = Image.open(picture, mode='r')
        image_copy = image.copy()
        (width, height) = image.size
        filename = os.path.split(picture)[1]
        image_copy.thumbnail((width * (int(percentage) / 100), height * (int(percentage) / 100)))
        info_area.insert('end', '\n' + filename)
        info_area.see(tkinter.END)
        image_copy.save(save_dir + filename)
        counter += 1
        if counter % 3 == 0:
            update_counter(1, thread_no)
    update_counter(0, thread_no)

def resize():
    global start_time
    start_time = timeit.default_timer()
    percentage = percentage_textbox.get()
    if not percentage:
        info_area.insert('end', 'Please write a percentage!')
        return
    askdir_entry.config(state='disabled')
    percentage_textbox.config(state='disabled')
    file_list = glob.glob(askdir_entry.get() + '/*.jp*g')
    info_area.insert('end', 'Found ' + str(len(file_list)) + ' pictures.\n')
    cpu = multiprocessing.cpu_count()
    info_area.insert('end', 'Number of threads: ' + str(cpu))
    info_area.insert('end', '\nResizing pictures..\n\n')
    if not os.path.exists(askdir_entry.get() + '/ResizeImage'):
        os.makedirs(askdir_entry.get() + '/ResizeImage')
    counter_label.config(text='-')
    for i in range(0, cpu):
        file_list_chunk = file_list[int(i * len(file_list) / cpu):int((i + 1) * len(file_list) / cpu)]
        threading.Thread(target=worker1, args=(file_list_chunk, percentage, i + 1)).start()
A PIL simulation, which requires target connectivity, compiles generated source code, and then downloads and runs object code on NVIDIA® GPU platforms.
When doing image processing, we need fast access to pixel values. GPUs are designed for graphics work such as texturing, so their hardware for accessing and manipulating pixels is well optimized.
Running a Python script on the GPU can therefore be faster than running it on the CPU. Note, however, that before the GPU can process a data set, the data must first be transferred to the GPU's memory, which takes additional time. If the data set is small, the CPU may still perform better than the GPU.
Image resize is not actually very CPU-intensive. You'll find that a lot of your overall time is being spent in the image decode and encode libraries where a GPU is of little help.
A simple thing would be to try swapping PIL out for pillow-simd. It's compatible with pillow, but many inner loops have been replaced with hand-written vector code. You can typically expect a 6x to 10x speedup for the image resizing step.
libjpeg supports very fast shrink on load. It can do a x2, x4 or x8 shrink as part of image decode -- you can easily get a 20x speedup for large shrinks. You'd need to look into how to enable this in pillow.
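For what it's worth, Pillow appears to expose shrink-on-load through `Image.draft()`, which has to be called before any pixel data is read. The sketch below is only an illustration under that assumption: the helper name `shrink_jpeg` and the synthetic in-memory test image are made up for the example and are not part of the question's code. Note that calling `image.copy()` right after `Image.open()`, as the question does, decodes the image at full size, so the decoder never gets a chance to shrink it.

```python
import io

from PIL import Image


def shrink_jpeg(source, percentage):
    """Open a JPEG and scale it to `percentage` of its size (hypothetical helper)."""
    image = Image.open(source)
    width, height = image.size
    target = (max(1, width * percentage // 100), max(1, height * percentage // 100))
    # draft() must run before the pixel data is loaded. For JPEG it asks the
    # decoder for the largest reduction (1/2, 1/4 or 1/8) that still leaves
    # the image at least as large as `target`.
    image.draft("RGB", target)
    # thumbnail() then does any remaining resizing in place.
    image.thumbnail(target)
    return image


# Demo on a synthetic in-memory JPEG so the sketch runs without any files.
buffer = io.BytesIO()
Image.new("RGB", (1600, 1200), "steelblue").save(buffer, format="JPEG")
buffer.seek(0)
small = shrink_jpeg(buffer, 25)
print(small.size)
```

The benefit depends on the shrink factor: a small reduction such as 90% leaves nothing for the decoder to skip, while reductions to a half or less should benefit most.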
You could also consider other image processing libraries. libvips has a fast and low memory command-line tool for image shrinking, vipsthumbnail. Combined with GNU parallel, you can easily get a huge speedup.
For example, I can make a directory of 1,000 large JPG images:
$ vipsheader ../nina.jpg
../nina.jpg: 6048x4032 uchar, 3 bands, srgb, jpegload
$ for i in {1..1000}; do cp ../nina.jpg $i.jpg; done
Then shrink with imagemagick like this:
$ time for i in {1..1000}; do convert $i.jpg -resize 128x128 tn_$i.jpg; done
real 6m43.627s
user 31m29.894s
sys 1m51.352s
Or with GNU parallel and vipsthumbnail like this:
$ time parallel vipsthumbnail -s 128 ::: *.jpg
real 0m11.940s
user 1m15.820s
sys 0m11.916s
About 33x faster.
You could use convert with parallel, but each convert process needs about 400 MB of RAM with a 6k x 4k JPG image, so it would be easy to fill memory. You'd probably need to tune it a bit. vipsthumbnail only needs a few MB of RAM, so you can safely run many instances at once.
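If you would rather drive this from the Python app than from the shell, something along these lines might work. It is a rough sketch, not a tested replacement for GNU parallel: `thumbnail_command` and `shrink_folder` are hypothetical helper names, the worker count of 4 is an arbitrary default, and the only vipsthumbnail option used is the `-s` size flag from the command above.

```python
import shutil
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def thumbnail_command(path, size=128):
    """Build the command line used above: vipsthumbnail -s SIZE FILE."""
    return ["vipsthumbnail", "-s", str(size), str(path)]


def shrink_folder(folder, size=128, workers=4):
    """Run one vipsthumbnail process per JPEG, `workers` at a time."""
    if shutil.which("vipsthumbnail") is None:
        raise RuntimeError("vipsthumbnail not found on PATH")
    files = sorted(Path(folder).glob("*.jpg"))
    # Threads are enough here: each one only waits for an external process,
    # so the image work itself happens outside the Python interpreter.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda f: subprocess.run(thumbnail_command(f, size), check=True),
            files,
        )
        # Consuming the iterator re-raises any error from a failed process.
        return len(list(results))
```

Calling `shrink_folder("some/folder")` would then return the number of files processed.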