Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

when array.array is more efficient than lists?

I was reading the book 'Fluent Python' when I encountered a sentence where the author states that

If you need to store 10 million floating-point values an array is much more efficient, because an array does not actually hold full fledged objects, but only the packed bytes representing their machine values - just like array in C language.

I am not able to understand what the author is trying to convey. What is he saying about 'packed bytes' ? what does 'packed bytes storing' mean ? . How is python lists storing it ? why is it not storing it that way if that is what is making it efficient ?

like image 836
Harish Kayarohanam Avatar asked Dec 23 '16 20:12

Harish Kayarohanam


People also ask

Why are arrays more efficient than lists?

Arrays can store data very compactly and are more efficient for storing large amounts of data. Arrays are great for numerical operations; lists cannot directly handle math operations. For example, you can divide each element of an array by the same number with just one line of code.

Which is more efficient array or list?

The array is faster in case of access to an element while List is faster in case of adding/deleting an element from the collection.

Are array lists more efficient than arrays?

An array is faster and that is because ArrayList uses a fixed amount of array. However when you add an element to the ArrayList and it overflows. It creates a new Array and copies every element from the old one to the new one.

Why are arrays more efficient than lists Python?

Array can only be used for specific types, whereas lists can be used for any object. Arrays can also only data of one type, whereas a list can have entries of various object types. Arrays are also more efficient for some numerical computation.


1 Answers

Let's say you're dealing with 8-byte floating-point numbers. "Packed bytes" in this context means that there's a dedicated chunk of allocated memory in which the first 8 bytes represent the first float, and then immediately the next 8 bytes represent the next float, and so on with no wastage. It's the most space-efficient way there is of storing the data (at least, without compression). It may also be the most time-efficient for certain operations (for example, arraywise arithmetic operations).

A Python list doesn't store things that way. For one thing, one list element could be a float but the next one might be some other type of object. For another thing, you can remove, insert or replace items in a list. Some of these operations involve lengthening or shortening the list dynamically. All are very time- and memory- inefficient if items are stored as packed bytes. The Python list class is designed to be as general-purpose as possible, making compromises between the efficiency of various types of operations.

Probably the most important difference is that a Python list, in its underlying C implementation, is a container full of pointers to objects, rather than a container full of raw object content. One implication of this is that multiple references to the same Python object can appear in a list. Another is that changing a particular item can be done very efficiently. For example, let's say the first item in your list, a[0], is an integer, but you want to replace it with a string that takes up more memory, e.g. a[0] = "There's a horse in aisle five." A packed array would have to (a) make extra room, shifting all of the rest of the array content in memory and (b) separately update some sort of index of item sizes and types. But a Python list would only have to overwrite one pointer value with another.

In the CPython implementation, the pointers themselves may still be (more or less) packed in memory. This means that inserting a new item into a list will usually still be inefficient (relative to the way it would be if the Python list implementation used, say, a link-list structure under the hood).

In general, there's no absolute "efficient" or "inefficient"—it's all a question of which resource you're being efficient with, what (restrictions on) content types there are in the container, and how you are intending to transform the container or its contents.

like image 165
jez Avatar answered Oct 23 '22 13:10

jez