numpy array row major and column major

Tags: python, arrays, numpy

I'm having trouble understanding how numpy stores its data. Consider the following:

>>> import numpy as np
>>> a = np.ndarray(shape=(2,3), order='F')
>>> for i in xrange(6): a.itemset(i, i+1)
... 
>>> a
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])
>>> a.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

This says that a is column-major (F_CONTIGUOUS); thus, internally, a should look like the following:

[1, 4, 2, 5, 3, 6]

This is just what is stated in this glossary. What confuses me is that if I try to access the data of a in a linear fashion, I instead get:

>>> for i in xrange(6): print a.item(i)
... 
1.0
2.0
3.0
4.0
5.0
6.0

At this point I'm not sure what the F_CONTIGUOUS flag tells us, since item does not seem to honor the ordering. Apparently everything in Python is row-major, and when we want to iterate in a linear fashion we can use the iterator flat.

The question is the following: given a list of numbers, say 1, 2, 3, 4, 5, 6, how can we create a numpy array of shape (2, 3) in column-major order? That is, how can I get a matrix that looks like this:

array([[ 1.,  3.,  5.],
       [ 2.,  4.,  6.]])

I would really like to be able to iterate linearly over the list and place the values into the newly created ndarray. The reason is that I will be reading files of multidimensional arrays stored in column-major order.

asked Dec 03 '13 02:12

jmlopez

2 Answers

NumPy stores data in row-major (C) order by default.

>>> a = np.array([[1,2,3,4], [5,6,7,8]])
>>> a.shape
(2, 4)
>>> a.shape = 4,2
>>> a
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

Changing the shape does not change the order of the data.
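As a small sketch of this point (assuming Python 3 and a reasonably recent NumPy), a default reshape of a contiguous array gives a view on the same buffer, and the flattened sequence of elements stays the same:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
b = a.reshape(4, 2)  # default order='C'; a view on the same buffer here

# Both arrays share memory, and reading them in row-major order
# yields the same sequence of elements.
same_buffer = np.shares_memory(a, b)
same_sequence = a.ravel().tolist() == b.ravel().tolist()
print(same_buffer, same_sequence)
```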

If you pass order='F' to reshape, you get what you want.

>>> b = np.array([1, 2, 3, 4, 5, 6])
>>> b
array([1, 2, 3, 4, 5, 6])
>>> c = b.reshape(2,3,order='F')
>>> c
array([[1, 3, 5],
       [2, 4, 6]])
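For the use case in the question (placing values one at a time while reading column-major data), the following is a minimal sketch, assuming Python 3. The list `values` is hypothetical stand-in data, and the explicit loop is only one possible approach; it uses np.unravel_index with order='F' to turn each linear position into a (row, column) index.

```python
import numpy as np

# Hypothetical input, already laid out in column-major order.
values = [1, 2, 3, 4, 5, 6]

# Option 1: let reshape fill the columns first.
m = np.array(values, dtype=float).reshape((2, 3), order='F')

# Option 2: iterate linearly and place each value by hand.
out = np.empty((2, 3), dtype=float)
for i, v in enumerate(values):
    # Map linear position i to (row, col), counting down the columns.
    row_col = np.unravel_index(i, out.shape, order='F')
    out[row_col] = v

print(m)
print(np.array_equal(m, out))
```

The reshape version is usually the simpler choice; the loop only shows how the linear position maps to indices.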

answered Oct 07 '22 08:10

Kill Console

Your question has been answered, but I thought I would add this to explain your observations regarding, "At this point I'm not sure what the F_CONTIGUOUS flag tells us since it does not honor the ordering."

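The item method doesn't index directly into the underlying memory as you assume; it takes a flat index in logical (row-major) order, regardless of how the array is stored. To see the memory layout, access the data attribute, which exposes the array's raw buffer.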

An example:

c = np.array([[1,2,3],
              [4,6,7]], order='C')

f = np.array([[1,2,3],
              [4,6,7]], order='F')

Observe

print c.flags.c_contiguous, f.flags.f_contiguous
# True True

and

print c.nbytes == len(c.data)
# True

Now let's print the contiguous data for both:

nelements = np.prod(c.shape)
bsize = c.dtype.itemsize # should be 8 bytes for 'int64'
for i in range(nelements):
    bnum = c.data[i*bsize : (i+1)*bsize] # The element as a byte string.
    print np.fromstring(bnum, dtype=c.dtype)[0], # Convert to number.

This prints:

1 2 3 4 6 7

which is what we expect since c is order 'C', i.e., its data is stored row-major contiguous.

On the other hand,

nelements = np.prod(f.shape)
bsize = f.dtype.itemsize # should be 8 bytes for 'int64'
for i in range(nelements):
    bnum = f.data[i*bsize : (i+1)*bsize] # The element as a byte string.
    print np.fromstring(bnum, dtype=f.dtype)[0], # Convert to number.

prints

1 4 2 6 3 7

which, again, is what we expect to see since f's data is stored column-major contiguous.
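The snippets above use Python 2 syntax. On Python 3 the data attribute is a memoryview, and byte-slicing it this way may not work. A rough equivalent, assuming Python 3 and a recent NumPy, is sketched below; `memory_order` is a hypothetical helper name, and it copies the bytes with tobytes(order='A') rather than reading the buffer in place.

```python
import numpy as np

c = np.array([[1, 2, 3], [4, 6, 7]], dtype=np.int64, order='C')
f = np.array([[1, 2, 3], [4, 6, 7]], dtype=np.int64, order='F')

def memory_order(arr):
    """Return the elements of a contiguous array in the order they sit in memory.

    tobytes(order='A') uses Fortran order for F-contiguous arrays and C order
    otherwise, so for these two arrays it matches the storage order.
    """
    raw = arr.tobytes(order='A')  # a copy of the array's bytes
    return np.frombuffer(raw, dtype=arr.dtype).tolist()

print(memory_order(c))        # [1, 2, 3, 4, 6, 7]
print(memory_order(f))        # [1, 4, 2, 6, 3, 7]
print(c.strides, f.strides)   # (24, 8) (8, 16)
print(f.item(1))              # 2: item() takes a flat index in row-major order
```

The strides tell the same story: in c, moving along a row steps 8 bytes, while in f, moving down a column steps 8 bytes.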

answered Oct 07 '22 07:10

Matt Hancock
