What does the order parameter in numpy.array() do?
It says in the documentation I link to that it will specify the contiguous order of the array, but I got no idea what that is supposed to mean. So what is contiguous order?
Copy of the order parameter documentation:
order : {‘C’, ‘F’, ‘A’}, optional Specify the order of the array. If order is ‘C’ (default), then the array will be in C-contiguous order (last-index varies the fastest). If order is ‘F’, then the returned array will be in Fortran-contiguous order (first-index varies the fastest). If order is ‘A’, then the returned array may be in any order (either C-, Fortran-contiguous, or even discontiguous).
ascontiguousarray() function is used to return a contiguous array where the dimension of the array is greater or equal to 1 and stored in memory (C order). Note: A contiguous array is stored in an unbroken block of memory. To access the subsequent value in the array, we move to the next memory address.
Ordered sequence is any sequence that has an order corresponding to elements, like numeric or alphabetical, ascending or descending. The NumPy ndarray object has a function called sort() , that will sort a specified array.
A contiguous array is just an array stored in an unbroken block of memory: to access the next value in the array, we just move to the next memory address.
A) Array means contiguous memory. It can exist in any memory section be it Data or Stack or Heap.
Lets first unpack what K
A
C
and F
stand for first. I am referring to the implementation details section of this.
C
Is Contiguous layout. Mathematically speaking, row major.F
Is Fortran contiguous layout. Mathematically speaking, column major.A
Is any order. Generally don't use this.K
Is keep order. Generally don't use this.From here I can refer you to other answers that address the two following questions: Data Contiguity and Row vs. Column Major Ordering. Row vs Column Major Ordering is best described by its Wikipedia article. So now lets talk about data contiguity. In python this generally is not so important so I'm going to jump to C for a moment here.
In C there are two options for storing a 2D array.
In the first example, the type of data we are storing in our array is another array. In terms of pointers, we have a block of memory where each value in it is a pointer to another block of memory. In order to find a value at any point we must de-reference first the outer array and then the inner array.
In the second example, we have a single block of memory the size of rows * columns
. We can we can de-reference any index to get its value. But the indices are 1 dimensional. A 2D index can be converted using y + x * width
.
When doing numerical calculations, we strive to use contiguous arrays. The reason for this is cache acceleration, which numpy relies on. If I want to add the value a
to each value in a 2D array, I could move the entire flattened array into the cache if it fits. However, you could only move a single column (or row) into the cache for an array of arrays. If you want to know more, look up SIMD [Same instruction multiple data].
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With