This question is somewhat language-agnostic, but my tool of choice happens to be a numpy array.
What I am doing is taking the difference of two images via PIL:
img = ImageChops.difference(img1, img2)
And I want to find the rectangular regions that contain changes from one picture to another. Of course there's the built in .getbbox()
method, but if there are two regions with changes it will return a box from one region to another, and if there are only 1 pixel changes in each corner it will return the whole image.
For instance consider the following where o
is a non-zero pixel:
______________________
|o ooo |
| oooo ooo |
| o |
| o o |
| |
| oo o |
| o o ooo |
| oo ooooo |
| ooo |
| o |
|____________________|
I'd like to get 4x4-tuples containing the bounding boxes for each non-zero region. For the edge case of the
oooo
o
o o
structure, I'm not terribly worried how that's handled - either getting both sections separately or together, because the bounds of the inverted-L shape will completely overlap the bounds of the single pixel.
I've never done anything this advanced with image processing so I wanted to get some input before I really write anything (and if there are pre-existing methods in the modules I'm already using, I welcome them!).
My psuedocode-ish version goes something like this:
for line in image:
started = False
for pixel in line:
if pixel and not started:
started = True
save start coords
elif started and not pixel:
started = False
save end coords (x - 1 of course)
This should give me a list of coordinates, but then I have to determine if the regions are contiguous. I could do that with a graph-type search? (We did plenty of DFS and BFS in Algorithms last semester) Of course I guess I could do that instead/in conjunction with my previous loops?
I won't be doing this on "large" images - they're pulled from a webcam and the best one I currently have does 640x480. At most I'd be doing 720p or 1080p, but that's far enough into the future that it's not a real concern.
So my question(s): Am I headed on the right path, or am I way off? And more important, are there any built-in functions that prevent me from re-inventing the wheel? And finally, are there any good resources I should look at (tutorials, papers, etc.) that will help out here?
Thanks!
I believe scipy's ndimage module has everything you need...
Here's a quick example
import numpy as np
import scipy as sp
import scipy.ndimage.morphology
# The array you gave above
data = np.array(
[
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
])
# Fill holes to make sure we get nice clusters
filled = sp.ndimage.morphology.binary_fill_holes(data)
# Now seperate each group of contigous ones into a distinct value
# This will be an array of values from 1 - num_objects, with zeros
# outside of any contigous object
objects, num_objects = sp.ndimage.label(filled)
# Now return a list of slices around each object
# (This is effectively the tuple that you wanted)
object_slices = sp.ndimage.find_objects(objects)
# Just to illustrate using the object_slices
for obj_slice in object_slices:
print data[obj_slice]
This outputs:
[[1]]
[[1 1 1]
[1 1 1]]
[[1 1 1 1]
[1 0 0 0]
[1 0 0 1]]
[[1]]
[[0 1 1 0]
[1 0 0 1]
[0 1 1 0]]
[[0 0 1 0 0]
[0 1 1 1 0]
[1 1 1 1 1]
[0 1 1 1 0]
[0 0 1 0 0]]
Note that the "object_slices" are basically what you originally asked for, if you need the actual indicies.
Edit: Just wanted to point out that despite it appearing to properly handle the edge case of
[[1 1 1 1]
[1 0 0 0]
[1 0 0 1]]
it actually doesn't (Thus the extra lone [[1]]). You can see this if you print out the "objects" array and take a look at objects 3 & 4.
[[1 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 0 0 0 0]
[0 0 0 0 0 0 3 3 3 3 0 0 0 2 2 2 0 0 0 0]
[0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 3 0 0 4 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 5 5 0 0 0 0 0 0 0 6 0 0 0 0 0]
[0 0 0 0 5 5 5 5 0 0 0 0 0 6 6 6 0 0 0 0]
[0 0 0 0 0 5 5 0 0 0 0 0 6 6 6 6 6 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 6 6 6 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 0 0 0]]
Hope that helps!
[1]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With