I have a folder full of images, each containing at least 4 smaller images. I would like to know how I can cut the smaller images out using Python PIL so that they all exist as independent image files. Fortunately there is one constant: the background is either white or black. So I'm guessing I need a way to cut these images out by searching for rows, or preferably columns, that are entirely black or entirely white. Here is an example image:
From the image above, there would be 10 separate images, each containing a number. Thanks in advance.
EDIT: I have another sample image that is more realistic in the sense that the backgrounds of some of the smaller images are the same colour as the background of the image they are contained in. e.g.
The output should be 13 separate images, each containing one letter.
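The column-scanning idea from the question can be sketched roughly as follows. This is only an illustration under assumptions: `split_on_blank_columns`, `background`, and `tol` are hypothetical names, the input is taken to be a 2-D grayscale array, and a column counts as blank when every pixel is within `tol` of the background value. It will not separate sub-images whose own background matches the page background and that touch each other.

```python
import numpy as np

def split_on_blank_columns(gray, background=0, tol=10):
    """Return (start, stop) column ranges separated by all-background columns.

    gray is a 2-D array; stop is exclusive, so gray[:, start:stop] is one piece.
    """
    # A column is blank when every pixel is within tol of the background value.
    blank = (np.abs(gray.astype(int) - background) <= tol).all(axis=0)
    # Pad with blank columns so pieces touching the image edge are closed.
    padded = np.concatenate(([True], blank, [True])).astype(int)
    change = np.diff(padded)
    starts = np.flatnonzero(change == -1)  # blank -> content
    stops = np.flatnonzero(change == 1)    # content -> blank
    return list(zip(starts.tolist(), stops.tolist()))

# Synthetic example: black page with two bright blocks.
page = np.zeros((4, 10), dtype=np.uint8)
page[:, 1:3] = 255
page[:, 5:9] = 200
ranges = split_on_blank_columns(page, background=0)
pieces = [page[:, a:b] for a, b in ranges]
```

For a white page, pass `background=255`; the same function applied to the transposed array (`gray.T`) would scan rows instead of columns.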
Using scipy.ndimage for labeling:
import numpy as np
import scipy.ndimage as ndi
from PIL import Image
THRESHOLD = 100
MIN_SHAPE = np.asarray((5, 5))
filename = "eQ9ts.jpg"
# convert to RGB so the array always has a trailing channel axis
im = np.asarray(Image.open(filename).convert("RGB"))
gray = im.sum(axis=-1)
bw = gray > THRESHOLD
label, n = ndi.label(bw)
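# labels run from 1 to n inclusive
indices = [np.where(label == ind) for ind in range(1, n + 1)]
# bounding box per region; +1 because slice ends are exclusive
slices = [tuple(slice(ind[i].min(), ind[i].max() + 1) for i in (0, 1)) + (slice(None),)
          for ind in indices]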
images = [im[s] for s in slices]
# filter out small images
images = [im for im in images if not np.any(np.asarray(im.shape[:-1]) < MIN_SHAPE)]
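Since the question asks for independent image files, here is a minimal sketch of the same labeling approach with the crops written to disk. It swaps the `np.where` bounding-box step for `scipy.ndimage.find_objects`, which returns one tuple of slices per label. The function names `crop_regions` and `save_crops`, the `piece` file prefix, and the PNG format are assumptions made for illustration, and like the code above it assumes a dark background.

```python
import numpy as np
import scipy.ndimage as ndi

def crop_regions(im, threshold=100, min_shape=(5, 5)):
    """Return crops of im (H x W x 3 array), one per bright connected region."""
    bw = im.sum(axis=-1) > threshold
    label, n = ndi.label(bw)
    crops = []
    # find_objects yields a (row_slice, col_slice) bounding box for each label
    for rows, cols in ndi.find_objects(label):
        crop = im[rows, cols]
        # skip specks smaller than min_shape in either dimension
        if crop.shape[0] >= min_shape[0] and crop.shape[1] >= min_shape[1]:
            crops.append(crop)
    return crops

def save_crops(crops, prefix="piece"):
    """Write each crop as <prefix>_<index>.png and return the file names."""
    from PIL import Image  # imported here so cropping alone does not need Pillow
    names = []
    for i, crop in enumerate(crops):
        name = f"{prefix}_{i}.png"
        Image.fromarray(crop).save(name)
        names.append(name)
    return names

# Synthetic example: two bright rectangles and one single-pixel speck.
demo = np.zeros((20, 30, 3), dtype=np.uint8)
demo[2:10, 3:12] = 255
demo[12:18, 20:28] = 200
demo[0, 29] = 255
demo_crops = crop_regions(demo)
```

On a real file, something like `save_crops(crop_regions(np.asarray(Image.open(filename).convert("RGB"))))` would write one PNG per detected region. For a white background, the mask would need inverting before labeling.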