Numpy reshape on view

Tags: python, numpy, reshape

I'm confused by the result of calling numpy's reshape on a view. In the session below, q.flags shows that q does not own its data, yet q.base is neither x nor y, so what is it? I'm also surprised that q.strides is (8,), which means (if I understand correctly) that each successive element is found by moving 8 bytes forward in memory. But if no array other than x owns data, the only data buffer is x's, and that buffer cannot yield the elements of q in order by stepping 8 bytes at a time.

In [99]: x = np.random.rand(4, 4)

In [100]: y = x.T

In [101]: q = y.reshape(16)

In [102]: q.base is y
Out[102]: False

In [103]: q.base is x
Out[103]: False

In [104]: y.flags
Out[104]: 
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [105]: q.flags
Out[105]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [106]: q.strides
Out[106]: (8,)

In [107]: x
Out[107]: 
array([[ 0.62529694,  0.20813211,  0.73932923,  0.43183722],
       [ 0.09755023,  0.67082005,  0.78412615,  0.40307291],
       [ 0.2138691 ,  0.35191283,  0.57455781,  0.2449898 ],
       [ 0.36476299,  0.36590522,  0.24371933,  0.24837697]])

In [108]: q
Out[108]: 
array([ 0.62529694,  0.09755023,  0.2138691 ,  0.36476299,  0.20813211,
        0.67082005,  0.35191283,  0.36590522,  0.73932923,  0.78412615,
        0.57455781,  0.24371933,  0.43183722,  0.40307291,  0.2449898 ,
        0.24837697])

UPDATE:

It turns out that this question has been asked in the numpy discussion forum: http://numpy-discussion.10968.n7.nabble.com/OWNDATA-flag-and-reshape-views-vs-copies-td10363.html


asked Mar 05 '15 20:03

shaoyl85

2 Answers

In short: you cannot always rely on ndarray.flags['OWNDATA'] alone.

>>> import numpy as np
>>> x = np.random.rand(2,2)
>>> y = x.T
>>> q = y.reshape(4)
>>> y[0,0]
0.86751629121019136
>>> y[0,0] = 1
>>> q
array([ 0.86751629,  0.87671107,  0.65239976,  0.41761267])
>>> x
array([[ 1.        ,  0.65239976],
       [ 0.87671107,  0.41761267]])
>>> y
array([[ 1.        ,  0.87671107],
       [ 0.65239976,  0.41761267]])
>>> y.flags['OWNDATA']
False
>>> x.flags['OWNDATA']
True
>>> q.flags['OWNDATA']
False
>>> np.may_share_memory(x,y)
True
>>> np.may_share_memory(x,q)
False

Because q did not reflect the change to the first element, unlike x and y, its data must live in a separate buffer that something other than x owns (what that something is gets explained below).

There is more discussion about the OWNDATA flag on the numpy-discussion mailing list. The question "How can I tell if NumPy creates a view or a copy?" briefly mentions that simply checking flags.owndata of an ndarray sometimes seems to fail and appears unreliable, as you mention. That's because every ndarray also has a base attribute:

The base of an ndarray is a reference to another array if the memory originated elsewhere (otherwise, the base is None).

The operation y.reshape(4) creates a copy, not a view, because the strides of y are (8, 16). To read y in C order as a (4,) array, the memory pointer would have to jump 0 -> 16 -> 8 -> 24 bytes, which cannot be done with a single stride. Thus q.base refers to the array produced by the forced copy inside y.reshape. It has the same shape as y but freshly copied elements, and therefore normal C-contiguous strides again: (16, 8). No other name is bound to q.base, since it exists only as the result of that forced copy.

That copy can be viewed with shape (4,), because its strides allow it. So q is indeed a view, but a view on q.base.

Most people would find it confusing that q.flags.owndata is False, because, as shown above, q is not a view on y. It is, however, a view on a copy of y, and that copy, q.base, owns the data. So the flags are actually correct if you inspect them closely.
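The chain described above can be checked directly. The following is a minimal sketch, assuming a reasonably recent NumPy; the name hidden is only an illustrative label for q.base, and np.shares_memory is used instead of np.may_share_memory because it performs an exact overlap check rather than a bounds comparison.

```python
import numpy as np

x = np.arange(4, dtype=np.float64).reshape(2, 2)
y = x.T               # view of x with reversed strides
q = y.reshape(4)      # C-order reshape of y cannot reuse x's buffer

# "hidden" is just a label for the array that reshape built internally.
hidden = q.base

print("q owns data:        ", q.flags['OWNDATA'])
print("q.base is x / is y: ", hidden is x, hidden is y)
print("q shares with x:    ", np.shares_memory(q, x))
print("q shares with base: ", np.shares_memory(q, hidden))
print("base shape, strides:", hidden.shape, hidden.strides)

# Writing through y reaches x (same buffer) but leaves q untouched.
y[0, 0] = 99.0
print("x[0, 0], q[0]:      ", x[0, 0], q[0])
```

If q.base turns out to be an array that shares no memory with x, that confirms q is a view on an intermediate copy rather than on x or y.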


answered Nov 15 '22 13:11

Oliver W.

I like to use .__array_interface__.

In [811]: x.__array_interface__
Out[811]: 
{'data': (149194496, False),
 'descr': [('', '<f8')],
 'shape': (4, 4),
 'strides': None,
 'typestr': '<f8',
 'version': 3}

In [813]: y.__array_interface__
Out[813]: 
{'data': (149194496, False),
 'descr': [('', '<f8')],
 'shape': (4, 4),
 'strides': (8, 32),
 'typestr': '<f8',
 'version': 3}

In [814]: x.strides
Out[814]: (32, 8)
In [815]: y.strides
Out[815]: (8, 32)

Transpose was performed by reversing the strides. The base data pointer is the same.

In [817]: q.__array_interface__
Out[817]: 
{'data': (165219304, False),
 'descr': [('', '<f8')],
 'shape': (16,),
 'strides': None,
 'typestr': '<f8',
 'version': 3}

So q's data is a copy (different pointer). Strides of (8,) mean its elements are accessed by stepping from one f8 to the next. By contrast, x.reshape(16) is a view of x, because x's data can be traversed with a simple 8-byte step.

To access the original data in q's order, it would have to step 32 bytes 3 times (moving down the rows of x), then go back to the start and step 8 bytes to the 2nd column of x, followed by 3 more row steps, etc. Since striding doesn't work this way, reshape has to work from a copy.
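The stepping pattern described above can be made concrete by listing the byte offset of each element in C (row-major) index order. This is only an illustrative sketch: c_order_offsets and has_constant_step are hypothetical helper names, not NumPy functions, and the check is a simplification of what NumPy does internally when it decides whether a reshape needs a copy.

```python
import itertools
import numpy as np

def c_order_offsets(arr):
    """Byte offset of every element, visited in C (row-major) index order."""
    return [
        sum(i * s for i, s in zip(idx, arr.strides))
        for idx in itertools.product(*(range(n) for n in arr.shape))
    ]

def has_constant_step(offsets):
    """True if consecutive offsets always differ by the same amount."""
    steps = {b - a for a, b in zip(offsets, offsets[1:])}
    return len(steps) <= 1

x = np.zeros((4, 4))   # float64, strides (32, 8)
y = x.T                # same buffer, strides (8, 32)

x_offsets = c_order_offsets(x)
y_offsets = c_order_offsets(y)

print(x_offsets[:6], has_constant_step(x_offsets))
print(y_offsets[:6], has_constant_step(y_offsets))
```

A flat 1-D view needs one constant step between elements. The offsets for x have one, so x.reshape(16) can be a view; the offsets for y do not, so y.reshape(16) has to copy.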

Note also that y[0,0] changes x[0,0], but q[0] is independent of both.

While OWNDATA for q is False, it is True for y.ravel() and y.flatten(). I suspect reshape() in this case makes a copy and then reshapes it, and it is the intermediate copy, q.base, that 'owns' the data.
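To compare these flattening routes side by side, one can print the flag, the presence of a base, and an exact memory-overlap check for each result. This is a sketch only: the OWNDATA and base columns are printed rather than asserted because they may differ between NumPy versions, and the dictionary names (candidates, shares_with_x) are illustrative.

```python
import numpy as np

x = np.random.rand(4, 4)
y = x.T

candidates = {
    "x.reshape(16)": x.reshape(16),
    "y.reshape(16)": y.reshape(16),
    "y.ravel()": y.ravel(),
    "y.flatten()": y.flatten(),
}

# Exact overlap test against x's buffer, independent of OWNDATA.
shares_with_x = {
    label: np.shares_memory(arr, x) for label, arr in candidates.items()
}

for label, arr in candidates.items():
    print(
        f"{label:14s} OWNDATA={arr.flags['OWNDATA']!s:5s} "
        f"has_base={arr.base is not None!s:5s} "
        f"shares_with_x={shares_with_x[label]}"
    )
```

The shares_with_x column is the dependable one for the view-or-copy question: only x.reshape(16) should overlap x's buffer, whatever the OWNDATA flags say.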

answered Nov 15 '22 12:11

hpaulj
