I have an image, stored in a numpy array of uint8
s, of shape (planes, rows, cols)
. I need to compare it to the values stored in a mask, also of uint8
s, of shape (mask_rows, mask_cols)
. While the image may be very large, the mask is usually smallish, normally (256, 256)
and is to be tiled over image
. To simplify the code, lets pretend that rows = 100 * mask_rows
and cols = 100 * mask_cols
.
The way I am currently handling this thresholding is something like this:
out = image >= np.tile(mask, (image.shape[0], 100, 100))
The largest array I can process this way before getting slapped in the face with a MemoryError
is a little larger than (3, 11100, 11100)
. The way I figured it, doing things this way I have up to three ginormous arrays coexisting in memory: image
, the tiled mask
, and my return out
. But the tiled mask is the same little array copied over and over 10,000 times. So if I could spare that memory, I would use only 2/3 the memory, and should be able to process images 3/2 larger, so of size around (3, 13600, 13600)
. This is, by the way, consistent with what I get if I do the thresholding in place with
np.greater_equal(image, (image.shape[0], 100, 100), out=image)
My (failed) attempt at exploiting the periodic nature of mask
to process larger arrays has been to index mask
with periodic linear arrays:
mask = mask[None, ...]
rows = np.tile(np.arange(mask.shape[1], (100,))).reshape(1, -1, 1)
cols = np.tile(np.arange(mask.shape[2], (100,))).reshape(1, 1, -1)
out = image >= mask[:, rows, cols]
For small arrays it does produce the same result as the other one, although with something of a 20x slowdown(!!!), but it terribly fails to perform for the larger sizes. Instead of a MemoryError
it eventually crashes python, even for values that the other method handles with no problems.
What I think is happening is that numpy is actually constructing the (planes, rows, cols)
array to index mask
, so not only is there no memory saving, but since it is an array of int32
s, it is actually taking four times more space to store...
Any ideas on how to go about this? To spare you the trouble, find below some sandbox code to play around with:
import numpy as np
def halftone_1(image, mask) :
return np.greater_equal(image, np.tile(mask, (image.shape[0], 100, 100)))
def halftone_2(image, mask) :
mask = mask[None, ...]
rows = np.tile(np.arange(mask.shape[1]),
(100,)).reshape(1, -1, 1)
cols = np.tile(np.arange(mask.shape[2]),
(100,)).reshape(1, 1, -1)
return np.greater_equal(image, mask[:, rows, cols])
rows, cols, planes = 6000, 6000, 3
image = np.random.randint(-2**31, 2**31 - 1, size=(planes * rows * cols // 4))
image = image.view(dtype='uint8').reshape(planes, rows, cols)
mask = np.random.randint(256,
size=(1, rows // 100, cols // 100)).astype('uint8')
#np.all(halftone_1(image, mask) == halftone_2(image, mask))
#halftone_1(image, mask)
#halftone_2(image, mask)
import timeit
print timeit.timeit('halftone_1(image, mask)',
'from __main__ import halftone_1, image, mask',
number=1)
print timeit.timeit('halftone_2(image, mask)',
'from __main__ import halftone_2, image, mask',
number=1)
I would almost have pointed you to a rolling window type of trick, but for this simple non-overlapping thing, normal reshape does it just as well. (the reshapes here are safe, numpy will never make a copy for them)
def halftone_reshape(image, mask):
# you can make up a nicer reshape code maybe, it is a bit ugly. The
# rolling window code can do this too (but much more general then reshape).
new_shape = np.array(zip(image.shape, mask.shape))
new_shape[:,0] /= new_shape[:,1]
reshaped_image = image.reshape(new_shape.ravel())
reshaped_mask = mask[None,:,None,:,None,:]
# and now they just broadcast:
result_funny_shaped = reshaped_image >= reshaped_mask
# And you can just reshape it back:
return result_funny_shaped.reshape(image.shape)
And since timings are everything (not really but...):
In [172]: %timeit halftone_reshape(image, mask)
1 loops, best of 3: 280 ms per loop
In [173]: %timeit halftone_1(image, mask)
1 loops, best of 3: 354 ms per loop
In [174]: %timeit halftone_2(image, mask)
1 loops, best of 3: 3.1 s per loop
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With