Resize image faster in OpenCV Python

I have a lot of image files in a folder (5M+). These images are of different sizes. I want to resize these images to 128x128.

I used the following function in a loop to resize the images in Python using OpenCV:

import glob

import cv2
from tqdm import tqdm

def read_image(img_path):
    # Load the image from disk and resize it to 128x128
    img = cv2.imread(img_path)
    img = cv2.resize(img, (128, 128))
    return img

for file in tqdm(glob.glob('train-images/*.jpg')):
    img = read_image(file)
    # Overwrite the original file with the resized image
    cv2.imwrite(file, img)

But it will take more than 7 hours to complete. I was wondering whether there is any way to speed up this process.

Can I use parallel processing to do this efficiently, with dask or something similar? If so, how?

asked Nov 04 '18 05:11

Sreeram TP

1 Answer

If you are absolutely intent on doing this in Python, then please just disregard my answer. If you are interested in getting the job done simply and fast, read on...

I would suggest GNU Parallel whenever you have lots of things to be done in parallel, and even more so as CPUs become "fatter" with more cores rather than "taller" with higher clock rates (GHz).

At its simplest, you can use ImageMagick just from the command line in Linux, macOS and Windows like this to resize a bunch of images:

magick mogrify -resize 128x128\! *.jpg

If you have hundreds of images, you would be better off running that in parallel, like this:

parallel magick mogrify -resize 128x128\! ::: *.jpg

If you have millions of images, the expansion of *.jpg will exceed the maximum length your system allows for a command line, so you can use the following to feed the image names in on stdin instead of passing them as parameters:

find . -iname \*.jpg -print0 | parallel -0 -X --eta magick mogrify -resize 128x128\!

There are two "tricks" here:

I use find ... -print0 along with parallel -0 to null-terminate filenames, so that spaces (or even newlines) in them cause no problems.
I use parallel -X which means that, rather than starting a whole new mogrify process for each image, GNU Parallel works out how many filenames mogrify can accept and gives it that many at a time, in batches.
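
For readers who want to see the two tricks spelled out, here is a rough Python sketch of the same idea. It is only an illustration and not how GNU Parallel is implemented: the helper names (parse_null_terminated, batches, mogrify_command, resize_all), the fixed batch size of 500 and the 4 workers are all assumptions made for the example, whereas GNU Parallel sizes its batches from the allowed command-line length. It also assumes the magick executable is on your PATH.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def parse_null_terminated(data):
    """Split the bytes produced by `find ... -print0` into filenames."""
    # Names are separated by NUL bytes, so spaces and newlines inside a name survive.
    return [os.fsdecode(name) for name in data.split(b"\0") if name]


def batches(names, size):
    """Yield consecutive lists of at most `size` filenames."""
    it = iter(names)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def mogrify_command(batch):
    """Build one mogrify invocation that covers a whole batch of files."""
    # The command is a list (no shell involved), so '!' needs no backslash.
    return ["magick", "mogrify", "-resize", "128x128!", *batch]


def resize_all(names, batch_size=500, workers=4):
    """Run one mogrify process per batch, `workers` batches at a time."""
    commands = [mogrify_command(b) for b in batches(names, batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The threads only wait on child processes; mogrify does the resizing.
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

Assuming find is available, something like resize_all(parse_null_terminated(subprocess.run(["find", ".", "-iname", "*.jpg", "-print0"], capture_output=True).stdout)) would hand it the same list of names that the pipeline above uses. In practice GNU Parallel does all of this for you, with better batch sizing and progress reporting.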

I commend both tools to you.

Whilst the ImageMagick aspects of the above answer work on Windows, I don't use Windows and I am unsure about using GNU Parallel there. I think it may run under git-bash and/or Cygwin. You could try asking a separate question; questions are free!

As regards the ImageMagick part, I think you can get a listing of all the JPEG filenames in a file using this command:

DIR /S /B *.JPG > filenames.txt
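
If you would rather build that listing from Python, a minimal sketch might look like the following. The function names list_jpegs and write_listing are made up for this example, and unlike DIR /S /B it writes paths relative to the folder you pass in unless that folder is itself given as an absolute path.

```python
from pathlib import Path


def list_jpegs(root):
    """Return every .jpg file under `root`, searching subfolders too."""
    # Compare the suffix in lower case so that .jpg and .JPG both match.
    return sorted(
        str(p)
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".jpg"
    )


def write_listing(root, listing="filenames.txt"):
    """Write one filename per line and return how many were written."""
    names = list_jpegs(root)
    Path(listing).write_text("".join(name + "\n" for name in names))
    return len(names)
```

Calling write_listing(".") from the folder that holds the images would then produce a filenames.txt with one name per line, the same layout as the DIR output.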

You can then probably process them (not in parallel) like this:

magick mogrify -resize 128x128\! @filenames.txt

And if you find out how to run GNU Parallel on Windows, you can probably process them in parallel using something like this:

parallel --eta -a filenames.txt magick mogrify -resize 128x128\!

answered Oct 01 '22 17:10

Mark Setchell

Tags: python, image-processing, opencv