I've got 20,000 rectangular images that I want to centre-crop so that I can plug them into a machine learning algorithm.
TensorFlow has tf.image.central_crop(), but that function takes a tensor and outputs a tensor, and I want to check the pictures before TF gets involved.
What's the best tool for cropping them in Python?
EDIT: Alternatively, what's the best algorithm for computing centre crops?
You can do this quickly and easily, without writing any code, using ImageMagick, which is installed on most Linux distros and is available for macOS and Windows.
In the Terminal, crop an image to the central 50x50 pixels by setting the gravity to center and specifying a zero offset from that position:
magick input.png -gravity center -crop 50x50+0+0 result.png
If you want to crop to the largest central square instead, use an fx expression to find the lesser of the height and width, and use that in place of both 50s:
magick input.png -gravity center -crop "%[fx:h<w?h:w]x%[fx:h<w?h:w]+0+0" result.png
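Since the question asks about Python and the underlying algorithm, here is a minimal sketch of the same "largest central square" computation. The names central_square_box and crop_to_square are made up for illustration, and the Pillow part assumes Pillow is installed. The result may differ from ImageMagick's by one pixel when the width and height differ by an odd amount, because the offset is rounded down here.

```python
def central_square_box(width, height):
    """Return (left, top, right, bottom) for the largest centred square."""
    side = min(width, height)  # same role as h<w?h:w in the fx expression
    left = (width - side) // 2  # integer division keeps the box on whole pixels
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def crop_to_square(src_path, dst_path):
    """Crop one image to its largest central square (requires Pillow)."""
    from PIL import Image  # imported lazily so the box helper has no dependencies

    with Image.open(src_path) as im:
        im.crop(central_square_box(*im.size)).save(dst_path)


print(central_square_box(640, 480))  # (80, 0, 560, 480)
```

Keeping the box arithmetic separate from the file handling makes it easy to check the numbers on a few known sizes before touching any images.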
OK, now we want to do 20,000 images, so we use GNU Parallel to run the crops across all CPU cores. That would be:
parallel magick {} -gravity center -crop ... {} ::: *.png
But now we have a new problem. With 20,000 files, the expanded list of filenames may be too long for ARG_MAX, so we need to feed the filenames in on stdin from find, with null-termination, like this:
find . -name \*.png -print0 | parallel -0 magick {} -gravity center -crop ... {}
We also have a problem with the special characters in the -crop expression, so we need to ask GNU Parallel to work out the quoting for us. So the final command becomes:
find . -name \*.png -print0 | parallel -0 --quote magick {} -gravity center -crop "%[fx:h<w?h:w]x%[fx:h<w?h:w]+0+0" {}
That is an extraordinarily powerful command that will rapidly change thousands of images, overwriting the originals, so please test it on a small subset of your images by copying them to somewhere safe first!
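If you would rather stay in Python and avoid overwriting anything, the same batch job can be sketched roughly as below. This is only an illustration under some assumptions: Pillow is installed, the helper names plan_outputs, crop_one and crop_tree are invented here, and the output directory sits outside the source directory so cropped files are not picked up on a second run.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def plan_outputs(src_dir, dst_dir, pattern="*.png"):
    """Pair every matching file under src_dir with a mirrored path under dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    return [(p, dst_dir / p.relative_to(src_dir))
            for p in sorted(src_dir.rglob(pattern))]


def crop_one(job):
    """Centre-crop one image to its largest square and save it to the new path."""
    from PIL import Image  # Pillow; imported here so planning works without it

    src, dst = job
    dst.parent.mkdir(parents=True, exist_ok=True)
    with Image.open(src) as im:
        side = min(im.size)
        left = (im.width - side) // 2
        top = (im.height - side) // 2
        im.crop((left, top, left + side, top + side)).save(dst)
    return dst


def crop_tree(src_dir, dst_dir, workers=None):
    """Crop all planned images in parallel; the originals are left untouched."""
    jobs = plan_outputs(src_dir, dst_dir)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(crop_one, jobs, chunksize=64))


# Hypothetical usage; keep the call under the __main__ guard so that
# worker processes can import this module safely:
# if __name__ == "__main__":
#     crop_tree("images", "images_cropped")
```

Printing plan_outputs(...) first plays the same role as --dry-run: it shows which files would be read and written without cropping anything.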
You can get a progress bar with:
parallel --bar ...
And you can do a "dry-run" asking GNU Parallel to show you what it would do without actually doing anything like this:
parallel --dry-run ...
There are ways of making this even faster and even easier to read - I may add them later when I have more time.
This is a simple way to centre-crop an image with Pillow:
from PIL import Image  # Pillow; the bare "import Image" only worked with the old PIL

def crop_image(file_name, new_height, new_width):
    """Centre-crop file_name.jpg and save the result as file_name_new.jpg."""
    im = Image.open(file_name + ".jpg")
    width, height = im.size
    # Integer division keeps the crop box on whole pixels
    left = (width - new_width) // 2
    top = (height - new_height) // 2
    right = left + new_width
    bottom = top + new_height
    crop_im = im.crop((left, top, right, bottom))  # Crop to the central region
    crop_im.save(file_name + "_new.jpg")  # Save next to the original

new_width = 256  # Crop width in pixels
new_height = 256  # Crop height in pixels
file_names = ["google"]  # File names, without the .jpg extension
for name in file_names:
    crop_image(name, new_height, new_width)
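One thing the snippet above does not check is whether the image is actually large enough for the requested crop. As far as I know, Pillow pads a crop box that extends past the image edges rather than raising an error, so a small image could slip through unnoticed. A possible guard, with the function name centre_crop_box being my own invention, might look like this:

```python
def centre_crop_box(width, height, new_width, new_height):
    """Return (left, top, right, bottom) for a centred crop of the given size.

    Raises ValueError if the requested crop does not fit inside the image.
    """
    if new_width > width or new_height > height:
        raise ValueError(
            f"crop {new_width}x{new_height} does not fit in {width}x{height}"
        )
    left = (width - new_width) // 2
    top = (height - new_height) // 2
    return (left, top, left + new_width, top + new_height)


print(centre_crop_box(640, 480, 256, 256))  # (192, 112, 448, 368)
```

The returned tuple can be passed straight to im.crop(), and the exception makes it easy to log or skip undersized images when running over a large batch.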