Cheapest way to get a numpy array into C-contiguous order?

The following produces a C-contiguous numpy array:

import numpy

a = numpy.ones((1024,1024,5))

Now if I slice it, the result may no longer be C-contiguous. For example:

bn = a[:, :, n]

with n from 0 to 4. My problem is that I need bn to be C-contiguous, and I need to do this for many instances of a. I just need each bn once, and want to avoid doing

bn = bn.copy(order='C')

I also don't want to rewrite my code such that

a = numpy.ones((5,1024,1024))

Is there a faster, cheaper way to get bn than doing the copy?

Background:

I want to hash each slice of every a, using

import hashlib

hashlib.sha1(a[:, :, n]).hexdigest()

Unfortunately, this raises a ValueError complaining about the order. So if there is another fast way to get the hash I want, I'd use that as well.

Daniel Sk asked Apr 29 '15 15:04



1 Answer

This is a standard operation when interfacing NumPy with C. Have a look at numpy.ascontiguousarray.

x = numpy.ascontiguousarray(x)

is the proper way of dealing with it.
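As a minimal sketch of that behaviour, the snippet below checks the flags before and after the call. The names bn, cn and same are illustrative only; the shapes follow the question.

```python
import numpy as np

a = np.ones((1024, 1024, 5))

bn = a[:, :, 2]                        # strided view into a
print(bn.flags['C_CONTIGUOUS'])        # the slice is not C-contiguous

cn = np.ascontiguousarray(bn)          # has to copy, because bn is strided
print(cn.flags['C_CONTIGUOUS'])
print(np.shares_memory(cn, a))         # cn owns fresh memory

same = np.ascontiguousarray(a)         # a is already C-contiguous
print(np.shares_memory(same, a))       # no new buffer is allocated here
```

So the call is free when the input is already in C order, and costs one copy of the slice otherwise.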

Use numpy.asfortranarray if you need Fortran order.

The function copies only if necessary, but for a slice like a[:, :, n] a copy is necessary, so there is no way around it. You can try rollaxis before your operation, so that the short axis becomes the first axis. This gives you a view of the array:

In [1]: import numpy as np
In [2]: A=np.random.rand(1024,1024,5)
In [3]: B=np.rollaxis(A,2)
In [4]: B.shape
Out[4]: (5, 1024, 1024)
In [5]: B.flags
Out[5]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [6]: A.flags
Out[6]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

B is only a view with permuted strides and is neither C- nor F-contiguous, so rollaxis does not solve this either.
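If the hashes are the real goal, one option is to pay for a single copy of the whole array with the short axis first, and then hash each slice of that copy. This is only a sketch under assumptions: it still copies the data once (it does not avoid the copy), it uses np.moveaxis in place of rollaxis, and the small array and the names b, hashes and reference are stand-ins for illustration.

```python
import hashlib
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((64, 64, 5))            # small stand-in for the (1024, 1024, 5) array

# One copy of the whole array with the short axis first (this is still a copy).
b = np.ascontiguousarray(np.moveaxis(a, 2, 0))

# Each b[n] is a C-contiguous view into b, so hashlib can read its buffer directly.
hashes = [hashlib.sha1(b[n]).hexdigest() for n in range(b.shape[0])]

# For comparison: copy every slice separately, as in the question.
reference = [
    hashlib.sha1(np.ascontiguousarray(a[:, :, n])).hexdigest()
    for n in range(a.shape[2])
]
print(hashes == reference)
```

Whether one large copy beats five per-slice copies depends on the array size and memory layout, so it is worth timing both on the real data.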

Bort answered Sep 27 '22 21:09