The following produces a C-contiguous numpy array:
import numpy
a = numpy.ones((1024,1024,5))
Now if I slice it, the result may no longer be C-contiguous. For example:
bn = a[:, :, n]
with n from 0 to 4.
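To make the problem concrete, here is a small check of the flags and strides, assuming the default float64 dtype (8 bytes per element). The slice keeps the parent's 40-byte step between neighbouring elements, which is why it is not contiguous:

```python
import numpy as np

a = np.ones((1024, 1024, 5))   # float64, C order
b0 = a[:, :, 0]                # a view, no data copied

print(a.flags['C_CONTIGUOUS'])    # True
print(b0.flags['C_CONTIGUOUS'])   # False
print(a.strides)                  # (40960, 40, 8)
print(b0.strides)                 # (40960, 40): elements are 40 bytes apart
```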
My problem is that I need bn to be C-contiguous, and I need to do this for many instances of a. I just need each bn once, and want to avoid doing
bn = bn.copy(order='C')
I also don't want to rewrite my code such that
a = numpy.ones((5,1024,1024))
Is there a faster, cheaper way to get bn than doing the copy?
Background:
I want to hash each slice of every a, using
import hashlib
hashlib.sha1(a[:, :, n]).hexdigest()
Unfortunately, this will throw a ValueError, complaining about the order. So if there is another fast way to get the hash I want, I'd also use it.
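One way to get the hash without an explicit copy() call is ndarray.tobytes(), which returns the data as a C-ordered bytes object whether or not the array is contiguous. This is only a sketch: hash_slice is a hypothetical helper name, and tobytes() still copies the slice internally, so it may not be faster than the copy.

```python
import hashlib
import numpy as np

a = np.ones((1024, 1024, 5))

def hash_slice(arr, n):
    """SHA-1 hex digest of arr[:, :, n] (hypothetical helper)."""
    # tobytes() serialises in C order by default, even for strided views
    return hashlib.sha1(arr[:, :, n].tobytes()).hexdigest()

digests = [hash_slice(a, n) for n in range(a.shape[2])]
```

The digest should match what hashing bn.copy(order='C') gives, since both feed the same bytes to SHA-1.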
numpy.ascontiguousarray() returns a contiguous array (ndim >= 1) in memory (C order). Note: a contiguous array is stored in an unbroken block of memory, so the next value in the array is found at the next memory address.
If order is 'C' (default), then the array will be in C-contiguous order (last-index varies the fastest). If order is 'F', then the returned array will be in Fortran-contiguous order (first-index varies the fastest).
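The two orders can be compared on a tiny array. This sketch uses ndarray.copy(order=...) and fixes the dtype to int64 so that the strides are the same on every platform:

```python
import numpy as np

x = np.arange(6, dtype=np.int64).reshape(2, 3)

c = x.copy(order='C')   # rows are laid out one after another
f = x.copy(order='F')   # columns are laid out one after another

print(c.flags['C_CONTIGUOUS'], c.flags['F_CONTIGUOUS'])   # True False
print(f.flags['C_CONTIGUOUS'], f.flags['F_CONTIGUOUS'])   # False True
print(c.strides)   # (24, 8): last index moves 8 bytes
print(f.strides)   # (8, 16): first index moves 8 bytes
```

Both copies hold the same values; only the memory layout differs.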
Because a NumPy array holds elements of a single type, it is densely packed in memory, and its memory can also be freed faster. Overall, a task executed in NumPy can be around 5 to 100 times faster than the same task on a standard Python list, depending on the operation.
NumPy arrays are faster than Python lists for the following reason: an array is a collection of elements of one data type stored in contiguous memory locations, whereas a Python list can hold objects of different types, and those objects are not stored contiguously in memory.
This is a standard operation when interfacing numpy with C. Have a look at numpy.ascontiguousarray:
x=numpy.ascontiguousarray(x)
is the proper way of dealing with it.
Use numpy.asfortranarray if you need Fortran order.
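A short check of when numpy.ascontiguousarray actually copies, using numpy.shares_memory to see whether the result still points at the original buffer. The array shape here is an arbitrary small example:

```python
import numpy as np

a = np.ones((64, 64, 5))

same = np.ascontiguousarray(a)             # already C-contiguous
plane = np.ascontiguousarray(a[:, :, 0])   # strided view

print(np.shares_memory(same, a))      # True: no copy was needed
print(np.shares_memory(plane, a))     # False: data was copied
print(plane.flags['C_CONTIGUOUS'])    # True
```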
As mentioned, the function will copy if necessary, so there is no way around it. You can try rollaxis before your operation, so that the short axis becomes the first axis. This gives you a view on the array:
In [1]: import numpy as np
In [2]: A=np.random.rand(1024,1024,5)
In [3]: B=np.rollaxis(A,2)
In [4]: B.shape
Out[4]: (5, 1024, 1024)
In [5]: B.flags
Out[5]:
C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In [6]: A.flags
Out[6]:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
So rollaxis does not solve this either.
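Since a copy cannot be avoided, one option is to do it once for the whole array rather than once per slice. The sketch below is a suggestion, not something from the question: it moves the short axis to the front with numpy.moveaxis (a view, like rollaxis), makes a single contiguous copy, and then hashes each plane directly. Whether this beats five separate copies would need to be timed on real data; the smaller shape is only to keep the example quick.

```python
import hashlib
import numpy as np

a = np.random.rand(256, 256, 5)

# One bulk copy: shape (5, 256, 256), so every planes[n] is a contiguous view
planes = np.ascontiguousarray(np.moveaxis(a, 2, 0))

# Contiguous arrays expose a buffer that hashlib accepts directly
digests = [hashlib.sha1(planes[n]).hexdigest() for n in range(planes.shape[0])]
```

Each digest should equal the one obtained from a[:, :, n].copy(order='C'), because the bytes hashed are the same.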