Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Use Python / PIL or similar to shrink whitespace

Any ideas how to use Python with the PIL module to shrink select all? I know this can be achieved with Gimp. I'm trying to package my app as small as possible, a GIMP install is not an option for the EU.

Say you have 2 images, one is 400x500, other is 200x100. They both are white with a 100x100 textblock somewhere within each image's boundaries. What I'm trying to do is automatically strip the whitespace around that text, load that 100x100 image textblock into a variable for further text extraction.

It's obviously not this simple, so just running the text extraction on the whole image won't work! I just wanted to query about the basic process. There is not much available on Google about this topic. If solved, perhaps it could help someone else as well...

Thanks for reading!

like image 579
user1145643 Avatar asked Feb 22 '12 14:02

user1145643


People also ask

Is pillow the same as PIL in Python?

What is PIL/Pillow? PIL (Python Imaging Library) adds many image processing features to Python. Pillow is a fork of PIL that adds some user-friendly features.

How do you trim an image in Python?

crop() method is used to crop a rectangular portion of any image. Parameters: box – a 4-tuple defining the left, upper, right, and lower pixel coordinate. Return type: Image (Returns a rectangular region as (left, upper, right, lower)-tuple).


2 Answers

If you put the image into a numpy array, it's simple to find the edges which you can use PIL to crop. Here I'm assuming that the whitespace is the color (255,255,255), you can adjust to your needs:

from PIL import Image
import numpy as np

im = Image.open("test.png")
pix = np.asarray(im)

pix = pix[:,:,0:3] # Drop the alpha channel
idx = np.where(pix-255)[0:2] # Drop the color when finding edges
box = map(min,idx)[::-1] + map(max,idx)[::-1]

region = im.crop(box)
region_pix = np.asarray(region)

To show what the results look like, I've left the axis labels on so you can see the size of the box region:

from pylab import *

subplot(121)
imshow(pix)
subplot(122)
imshow(region_pix)
show()

enter image description here

like image 65
Hooked Avatar answered Oct 13 '22 01:10

Hooked


The general algorithmn would be to find the color of the top left pixel, and then do a spiral scan inwards until you find a pixel not of that color. That will define one edge of your bounding box. Keep scanning until you hit one more of each edge.

like image 20
Tyler Eaves Avatar answered Oct 13 '22 01:10

Tyler Eaves