
python ravel vs. transpose when used in reshape

I have a 2D array v with v.shape=(M_1,M_2), which I want to reshape into a 3D array with shape (M_2,N_1,N_2), where M_1=N_1*N_2.

I came up with the following lines which produce the same result:

np.reshape(v.T, reshape_tuple)

and

np.reshape(v.ravel(order='F'), reshape_tuple)

for reshape_tuple=(M_2,N_1,N_2).

Which one is computationally better and in what sense (comp time, memory, etc.) if the original v is a huge (possibly complex-valued) matrix?

My guess would be that using the transpose is better, but if reshape does an automatic ravel then maybe the ravel-option is faster (though reshape might be doing the ravel in C or Fortran and then it's not clear)?

asked Jul 29 '16 11:07 by python_freak


1 Answer

The order in which they do things - reshape, change strides, and make a copy - differs, but they end up doing the same thing.

I like to use __array_interface__ to see where the data buffer is located, along with other changes. I suppose I should add the flags to see the order, but we already know that transpose changes the order to F, right?
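The flags mentioned above can be inspected directly. This is a minimal sketch, assuming the same 2x3 array as in the session below; it only reads the contiguity flags and checks that the transpose shares the original buffer.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)
xt = x.T

# A freshly reshaped arange is C-contiguous (row-major).
print(x.flags['C_CONTIGUOUS'], x.flags['F_CONTIGUOUS'])    # True False

# The transpose is F-contiguous (column-major) ...
print(xt.flags['C_CONTIGUOUS'], xt.flags['F_CONTIGUOUS'])  # False True

# ... and it is a view: no data was copied to build it.
print(np.shares_memory(x, xt))                             # True
```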

In [549]: x=np.arange(6).reshape(2,3)
In [550]: x.__array_interface__
Out[550]: 
{'data': (187732024, False),
 'descr': [('', '<i4')],
 'shape': (2, 3),
 'strides': None,
 'typestr': '<i4',
 'version': 3}

transpose is a view, with different shape, strides and order:

In [551]: x.T.__array_interface__
Out[551]: 
{'data': (187732024, False),
 'descr': [('', '<i4')],
 'shape': (3, 2),
 'strides': (4, 12),
 'typestr': '<i4',
 'version': 3}
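The strides are counted in bytes, so (4, 12) follows from the 4-byte '<i4' items in this session; on a platform whose default integer is 8 bytes the same commands would presumably report (8, 24). A small sketch, with the dtype pinned to int32 (my assumption, only to match the output here):

```python
import numpy as np

# dtype pinned to int32 so the byte counts match the '<i4' output above
x = np.arange(6, dtype=np.int32).reshape(2, 3)

print(x.strides)      # (12, 4): 12 bytes to the next row, 4 to the next column
print(x.T.strides)    # (4, 12): same buffer, the two steps swapped

# __array_interface__ reports None for strides when the array is C-contiguous,
# which is why the first output above shows 'strides': None.
print(x.__array_interface__['strides'])    # None
print(x.T.__array_interface__['strides'])  # (4, 12)
```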

ravel with different order is a copy (different data buffer pointer)

In [552]: x.ravel(order='F').__array_interface__
Out[552]: 
{'data': (182286992, False),
 'descr': [('', '<i4')],
 'shape': (6,),
 'strides': None,
 'typestr': '<i4',
 'version': 3}

transpose followed by ravel is also a copy. I think the identical data pointer is just a case of memory reuse (the previous temporary was freed because I'm not assigning it to a variable), but that can be checked.

In [553]: x.T.ravel().__array_interface__
Out[553]: 
{'data': (182286992, False),
 'descr': [('', '<i4')],
 'shape': (6,),
 'strides': None,
 'typestr': '<i4',
 'version': 3}

add the reshape:

In [554]: x.T.ravel().reshape(2,3).__array_interface__
Out[554]: 
{'data': (182286992, False),
 'descr': [('', '<i4')],
 'shape': (2, 3),
 'strides': None,
 'typestr': '<i4',
 'version': 3}
In [555]: x.ravel(order='F').reshape(2,3).__array_interface__
Out[555]: 
{'data': (182286992, False),
 'descr': [('', '<i4')],
 'shape': (2, 3),
 'strides': None,
 'typestr': '<i4',
 'version': 3}

I think there's an implicit 'ravel' in reshape:

In [558]: x.T.reshape(2,3).__array_interface__
Out[558]: 
{'data': (182286992, False),
 'descr': [('', '<i4')],
 'shape': (2, 3),
 'strides': None,
 'typestr': '<i4',
 'version': 3}

(I should rework these examples to get rid of that memory reuse ambiguity.) In any case, reshape after transpose requires the same memory copy that a ravel with order change does. And as far as I can tell only one copy is required for either case. The other operations just involve changes to attributes like shape.
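One way to remove the memory-reuse ambiguity is to keep every result in a named variable, so no buffer is freed and handed out again. This is only a sketch; the helper data_ptr is a name made up for illustration.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

# Named variables keep each buffer alive, so addresses cannot be recycled.
a = x.T.reshape(2, 3)
b = x.ravel(order='F').reshape(2, 3)
c = x.T.ravel().reshape(2, 3)

def data_ptr(arr):
    """Address of the first element, as reported by __array_interface__."""
    return arr.__array_interface__['data'][0]

print(np.shares_memory(x, x.T))  # True: the transpose alone is a view
print(np.shares_memory(x, a))    # False: reshape after transpose copied
print(np.shares_memory(x, b))    # False: ravel(order='F') copied
print(np.shares_memory(x, c))    # False: ravel after transpose copied

# Four live arrays, four distinct buffers.
print(len({data_ptr(x), data_ptr(a), data_ptr(b), data_ptr(c)}))  # 4

# All three routes hold the same values.
print(np.array_equal(a, b) and np.array_equal(a, c))  # True
```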

Maybe it's clearer if we just look at the arrays:

In [565]: x.T
Out[565]: 
array([[0, 3],
       [1, 4],
       [2, 5]])

In x.T we can still step through the values in numeric order by going down the columns, so a change of strides is enough to describe it. But after the reshape, the 1 isn't anywhere close to the 0. Clearly there's been a copy.

In [566]: x.T.reshape(2,3)
Out[566]: 
array([[0, 3, 1],
       [4, 2, 5]])

the order of values after the ravel is the same, which is more obvious after the reshape:

In [567]: x.ravel(order='F')
Out[567]: array([0, 3, 1, 4, 2, 5])
In [568]: x.ravel(order='F').reshape(2,3)
Out[568]: 
array([[0, 3, 1],
       [4, 2, 5]])
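The 2x3 example needs a copy on either route. For the 3D target shape in the question it may be worth checking directly, because there the reshape of v.T only splits its last axis, which NumPy can usually express through strides without copying. The sketch below uses made-up sizes and a random complex array (all assumptions, not from the question) to compare values, buffer addresses and rough timings. Treat the timings as indicative only.

```python
import timeit
import numpy as np

# Hypothetical sizes, picked only so the script runs quickly; M_1 = N_1 * N_2.
N_1, N_2, M_2 = 30, 40, 500
M_1 = N_1 * N_2
reshape_tuple = (M_2, N_1, N_2)

rng = np.random.default_rng(0)
v = rng.standard_normal((M_1, M_2)) + 1j * rng.standard_normal((M_1, M_2))

def data_ptr(arr):
    """Address of the first element, as reported by __array_interface__."""
    return arr.__array_interface__['data'][0]

via_transpose = np.reshape(v.T, reshape_tuple)
via_ravel = np.reshape(v.ravel(order='F'), reshape_tuple)

# Both routes give the same values ...
print(np.array_equal(via_transpose, via_ravel))  # True

# ... but only the ravel route has to allocate a new buffer here.
print(data_ptr(via_transpose) == data_ptr(v))    # True: still a view of v
print(data_ptr(via_ravel) == data_ptr(v))        # False: a copy

# Rough timings; absolute numbers depend on the machine and NumPy build.
t_transpose = timeit.timeit(lambda: np.reshape(v.T, reshape_tuple), number=20)
t_ravel = timeit.timeit(
    lambda: np.reshape(v.ravel(order='F'), reshape_tuple), number=20)
print(f"transpose + reshape:        {t_transpose:.6f} s")
print(f"ravel(order='F') + reshape: {t_ravel:.6f} s")
```

If the view result holds on your installation, the transpose route defers any copy until something downstream needs contiguous data, so the comparison that matters is the cost of the whole pipeline, not just this one line.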
answered Sep 28 '22 02:09 by hpaulj