In the following example:
>>> import numpy as np
>>> a = np.arange(10)
>>> b = a[:,np.newaxis]
>>> c = b.ravel()
>>> np.may_share_memory(a,c)
False
Why is numpy.ravel returning a copy of my array? Shouldn't it just be returning a view of a?
Edit:
I just discovered that np.squeeze doesn't return a copy.
>>> b = a[:,np.newaxis]
>>> c = b.squeeze()
>>> np.may_share_memory(a,c)
True
Why is there a difference between squeeze and ravel in this case?
Edit:
As pointed out by mgilson, newaxis marks the array as discontiguous, which is why ravel is returning a copy.
So, the new question is why newaxis marks the array as discontiguous.
The story gets even weirder though:
>>> a = np.arange(10)
>>> b = np.expand_dims(a,axis=1)
>>> b.flags
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
>>> c = b.ravel()
>>> np.may_share_memory(a,c)
True
According to the documentation for expand_dims, it should be equivalent to newaxis.
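As a baseline (my own minimal check, not part of the question above): ravel only copies when the array it is handed is not contiguous, which can be verified directly with may_share_memory:

```python
import numpy as np

a = np.arange(10)

# ravel on a contiguous array returns a view, so no copy is made
flat = a.ravel()
print(np.may_share_memory(a, flat))             # True

# a strided slice is not contiguous, so ravel must copy it
strided = a[::2]
print(np.may_share_memory(a, strided.ravel()))  # False
```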
This may not be the best answer to your question, but it looks like inserting a newaxis causes numpy to view the array as non-contiguous -- probably for broadcasting purposes:
>>> a=np.arange(10)
>>> b=a[:,None]
>>> a.flags
C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
>>> b.flags
C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
However, a reshape will not cause that:
>>> c=a.reshape(10,1)
>>> c.flags
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
And those arrays do share the same memory:
>>> np.may_share_memory(c.ravel(),a)
True
EDIT
np.expand_dims is actually implemented using reshape, which is why it works (this is a slight error in the documentation, I suppose). Here's the source (without the docstring):
def expand_dims(a, axis):
    a = asarray(a)
    shape = a.shape
    if axis < 0:
        axis = axis + len(shape) + 1
    return a.reshape(shape[:axis] + (1,) + shape[axis:])
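A quick check of this point (my own verification, not from the answer): the reshape that expand_dims performs preserves C-contiguity, so a subsequent ravel can return a view:

```python
import numpy as np

a = np.arange(10)

# The reshape expand_dims performs for axis=1, per the source above
b = a.reshape(a.shape[:1] + (1,) + a.shape[1:])

print(b.flags['C_CONTIGUOUS'])            # True: layout unchanged
print(np.may_share_memory(a, b.ravel()))  # True: ravel returns a view
```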
It looks like it may have to do with the strides:
>>> c = np.expand_dims(a, axis=1)
>>> c.strides
(8, 8)
>>> b = a[:, None]
>>> b.strides
(8, 0)
>>> b.flags
C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
>>> b.strides = (8, 8)
>>> b.flags
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
I'm not sure what difference the stride on dimension 1 could make here, but it looks like that's what's making numpy treat the array as not contiguous.
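The connection is that NumPy decides C-contiguity by comparing an array's strides against the strides a C-contiguous array of that shape would have. The helper below (c_contiguous_strides is my own name, not a NumPy function) makes that check explicit. Note that newer NumPy releases use relaxed stride checking, which ignores the strides of length-1 axes, so the flags shown in the transcripts above may differ on a current install:

```python
import numpy as np

def c_contiguous_strides(shape, itemsize):
    # Strides a C-contiguous array of this shape would have:
    # the last axis steps by itemsize, each earlier axis by the
    # product of all later dimensions times itemsize.
    strides = []
    acc = itemsize
    for dim in reversed(shape):
        strides.append(acc)
        acc *= dim
    return tuple(reversed(strides))

a = np.arange(10, dtype=np.int64)  # fix dtype so itemsize is 8
b = a[:, None]

print(c_contiguous_strides(b.shape, b.itemsize))  # (8, 8)
# If b.strides differs (e.g. (8, 0) on older NumPy), the strict
# comparison fails and the array is flagged non-contiguous.
print(b.strides)
```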