
numpy : How to convert an array type quickly

Tags:

python

numpy

I find the astype() method of numpy arrays not very efficient. I have an array containing 3 million uint8 points. Multiplying it by a 3x3 matrix takes 2 seconds, but converting the result from uint16 to uint8 takes another second.

More precisely:

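    import time
    import numpy as np

    # imgarray (about 3 million uint8 points) and M (a 3x3 matrix) are defined elsewhere
    print(time.perf_counter())
    imgarray = np.dot(imgarray, M) // 255  # floor division keeps the integer dtype
    print(time.perf_counter())
    imgarray = imgarray.clip(0, 255)
    print(time.perf_counter())
    imgarray = imgarray.astype('B')  # 'B' is the type code for uint8
    print(time.perf_counter())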

dot product and scaling takes 2 sec
clipping takes 200 msec
type conversion takes 1 sec

Given the time taken by the other operations, I would expect astype to be faster. Is there a faster way to do type conversion, or am I wrong in guessing that type conversion should not be that expensive?

Edit: the goal is to save the final 8-bit array to a file.

shodanex asked Dec 11 '09 15:12





1 Answer

When you use imgarray = imgarray.astype('B'), you get a copy of the array, cast to the specified type. This requires extra memory allocation, even though you immediately flip imgarray to point to the newly allocated array.
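The copy can be observed directly. The following is a minimal sketch, assuming a small uint32 array as a hypothetical stand-in for the question's data; it checks that the result of astype does not share memory with the original.

```python
import numpy as np

# Hypothetical stand-in for the question's array: small uint32 values in 0..255.
imgarray = np.arange(12, dtype=np.uint32).reshape(3, 4)

converted = imgarray.astype('B')  # 'B' is the type code for unsigned 8-bit

# astype allocated a new buffer, so the two arrays do not overlap in memory.
copies_data = not np.shares_memory(imgarray, converted)
print(converted.dtype, copies_data)
```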

If you use imgarray.view('uint8'), then you get a view of the array. This uses the same data except that it is interpreted as uint8 instead of imgarray.dtype. (np.dot returns a uint32 array, so after the np.dot, imgarray is of type uint32.)

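The problem with using view, however, is that each 32-bit integer is then seen as four 8-bit integers, and we only care about the low-order 8 bits (the values were already clipped to the range 0 to 255). On a little-endian machine that is the first byte of each group of four, so we need to take every 4th 8-bit integer. We can do that with slicing: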

    imgarray.view('uint8')[:,::4]
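As a rough check of the slicing trick, here is a sketch on a tiny hypothetical array. The `offset` variable is an addition for illustration: it picks the low-order byte on either byte order, whereas the plain `[:,::4]` form assumes a little-endian machine.

```python
import sys
import numpy as np

# Hypothetical data: uint32 values already clipped to 0..255.
imgarray = np.array([[0, 1, 254], [255, 128, 7]], dtype=np.uint32)

bytes_view = imgarray.view('uint8')   # shape (2, 12): four bytes per original value
# The low-order byte comes first on little-endian machines and last on big-endian ones.
offset = 0 if sys.byteorder == 'little' else 3
low_bytes = bytes_view[:, offset::4]  # shape (2, 3), still no copy

same_values = np.array_equal(low_bytes, imgarray.astype('B'))
shares_buffer = np.shares_memory(low_bytes, imgarray)
print(same_values, shares_buffer, low_bytes.flags['C_CONTIGUOUS'])
```

Note that the sliced view is not contiguous in memory, so code that needs a packed 8-bit buffer (for example, before writing raw bytes to a file) would still have to copy it.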

IPython's %timeit command shows there is a significant speed up doing things this way:

    In [37]: %timeit imgarray2 = imgarray.astype('B')
    10000 loops, best of 3: 107 us per loop

    In [39]: %timeit imgarray3 = imgarray.view('B')[:,::4]
    100000 loops, best of 3: 3.64 us per loop
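Outside IPython, a comparable measurement can be sketched with the standard timeit module. The array shape, the random data, and the three helper functions below are assumptions made for illustration. The third helper also materializes the view into a contiguous array, since creating the view alone copies no data and the two timings above therefore do not measure the same amount of work.

```python
import sys
import timeit
import numpy as np

# Hypothetical test data: 100,000 points with 3 channels, values in 0..255.
rng = np.random.default_rng(0)
imgarray = rng.integers(0, 256, size=(100_000, 3), dtype=np.uint32)
offset = 0 if sys.byteorder == 'little' else 3  # position of the low-order byte

def with_astype():
    # Allocates a new uint8 array and converts every element.
    return imgarray.astype('B')

def with_view():
    # Reinterprets the existing buffer; no element is copied.
    return imgarray.view('uint8')[:, offset::4]

def with_view_copied():
    # Packs the strided view into a contiguous uint8 array.
    return np.ascontiguousarray(with_view())

for func in (with_astype, with_view, with_view_copied):
    seconds = timeit.timeit(func, number=20) / 20
    print(f"{func.__name__}: {seconds * 1e6:.1f} us per call")
```

The absolute numbers depend on the machine and the NumPy version, so they should be read only as a relative comparison.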
unutbu answered Sep 17 '22 12:09
