Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Alternative to `numpy.tile` for periodic mask

Tags:

python

numpy

I have an image, stored in a numpy array of uint8s, of shape (planes, rows, cols). I need to compare it to the values stored in a mask, also of uint8s, of shape (mask_rows, mask_cols). While the image may be very large, the mask is usually smallish, normally (256, 256) and is to be tiled over image. To simplify the code, lets pretend that rows = 100 * mask_rows and cols = 100 * mask_cols.

The way I am currently handling this thresholding is something like this:

out = image >= np.tile(mask, (image.shape[0], 100, 100))

The largest array I can process this way before getting slapped in the face with a MemoryError is a little larger than (3, 11100, 11100). The way I figured it, doing things this way I have up to three ginormous arrays coexisting in memory: image, the tiled mask, and my return out. But the tiled mask is the same little array copied over and over 10,000 times. So if I could spare that memory, I would use only 2/3 the memory, and should be able to process images 3/2 larger, so of size around (3, 13600, 13600). This is, by the way, consistent with what I get if I do the thresholding in place with

np.greater_equal(image, (image.shape[0], 100, 100), out=image)

My (failed) attempt at exploiting the periodic nature of mask to process larger arrays has been to index mask with periodic linear arrays:

mask = mask[None, ...]
rows = np.tile(np.arange(mask.shape[1], (100,))).reshape(1, -1, 1)
cols = np.tile(np.arange(mask.shape[2], (100,))).reshape(1, 1, -1)
out = image >= mask[:, rows, cols]

For small arrays it does produce the same result as the other one, although with something of a 20x slowdown(!!!), but it terribly fails to perform for the larger sizes. Instead of a MemoryError it eventually crashes python, even for values that the other method handles with no problems.

What I think is happening is that numpy is actually constructing the (planes, rows, cols) array to index mask, so not only is there no memory saving, but since it is an array of int32s, it is actually taking four times more space to store...

Any ideas on how to go about this? To spare you the trouble, find below some sandbox code to play around with:

import numpy as np

def halftone_1(image, mask) :
    return np.greater_equal(image, np.tile(mask, (image.shape[0], 100, 100)))

def halftone_2(image, mask) :
    mask = mask[None, ...]
    rows = np.tile(np.arange(mask.shape[1]),
                   (100,)).reshape(1, -1, 1)
    cols = np.tile(np.arange(mask.shape[2]),
                   (100,)).reshape(1, 1, -1)
    return np.greater_equal(image, mask[:, rows, cols])

rows, cols, planes = 6000, 6000, 3
image = np.random.randint(-2**31, 2**31 - 1, size=(planes * rows * cols // 4))
image = image.view(dtype='uint8').reshape(planes, rows, cols)
mask = np.random.randint(256,
                         size=(1, rows // 100, cols // 100)).astype('uint8')

#np.all(halftone_1(image, mask) == halftone_2(image, mask))
#halftone_1(image, mask)
#halftone_2(image, mask)

import timeit
print timeit.timeit('halftone_1(image, mask)',
                    'from __main__ import halftone_1, image, mask',
                    number=1)
print timeit.timeit('halftone_2(image, mask)',
                    'from __main__ import halftone_2, image, mask',
                    number=1)
like image 385
Jaime Avatar asked Jan 22 '13 19:01

Jaime


1 Answers

I would almost have pointed you to a rolling window type of trick, but for this simple non-overlapping thing, normal reshape does it just as well. (the reshapes here are safe, numpy will never make a copy for them)

def halftone_reshape(image, mask):
    # you can make up a nicer reshape code maybe, it is a bit ugly. The
    # rolling window code can do this too (but much more general then reshape).
    new_shape = np.array(zip(image.shape, mask.shape))
    new_shape[:,0] /= new_shape[:,1]
    reshaped_image = image.reshape(new_shape.ravel())

    reshaped_mask = mask[None,:,None,:,None,:]

    # and now they just broadcast:
    result_funny_shaped = reshaped_image >= reshaped_mask

    # And you can just reshape it back:
    return result_funny_shaped.reshape(image.shape)

And since timings are everything (not really but...):

In [172]: %timeit halftone_reshape(image, mask)
1 loops, best of 3: 280 ms per loop

In [173]: %timeit halftone_1(image, mask)
1 loops, best of 3: 354 ms per loop

In [174]: %timeit halftone_2(image, mask)
1 loops, best of 3: 3.1 s per loop
like image 97
seberg Avatar answered Sep 29 '22 02:09

seberg