Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which is faster and why? Set or List?

Tags:

Lets say that I have a graph and want to see if b in N[a]. Which is the faster implementation and why?

a, b = range(2)
N = [set([b]), set([a,b])]

OR

N= [[b],[a,b]]

This is obviously oversimplified, but imagine that the graph becomes really dense.

like image 236
locoboy Avatar asked Oct 10 '11 18:10

locoboy


People also ask

Why is a set better than a list?

Because sets cannot have multiple occurrences of the same element, it makes sets highly useful to efficiently remove duplicate values from a list or tuple and to perform common math operations like unions and intersections.

Is set better than list?

Lists are slightly faster than sets when you just want to iterate over the values. Sets, however, are significantly faster than lists if you want to check if an item is contained within it. They can only contain unique items though.

How much faster are sets than lists Python?

As soon as you start testing membership of elements that are in the middle or end of the collection, sets perform 40% – 1800% better than lists or tuples You now have a fair idea why you should think of using sets for large collections…

Which is faster list or tuple or set?

Mutable, 2. Tuples are stored in a single block of memory. Tuples are immutable so, It doesn't require extra space to store new objects. Lists are allocated in two blocks: the fixed one with all the Python object information and a variable-sized block for the data. It is the reason creating a tuple is faster than List.


3 Answers

Membership testing in a set is vastly faster, especially for large sets. That is because the set uses a hash function to map to a bucket. Since Python implementations automatically resize that hash table, the speed can be constant (O(1)) no matter the size of the set (assuming the hash function is sufficiently good).

In contrast, to evaluate whether an object is a member of a list, Python has to compare every single member for equality, i.e. the test is O(n).

like image 98
phihag Avatar answered Dec 29 '22 00:12

phihag


Set ( I mean a hash based set like HashSet) is much faster than List to lookup for a value. List has to go sequentially to find out if the value exists. HashSet can directly jump and locate the bucket and look up for a value almost in a constant time.

like image 35
java_mouse Avatar answered Dec 28 '22 22:12

java_mouse


It all depends on what you're trying to accomplish. Using your example verbatim, it's faster to use lists, as you don't have to go through the overhead of creating the sets:

import timeit

def use_sets(a, b):
    return [set([b]), set([a, b])]

def use_lists(a, b):
    return [[b], [a, b]]

t=timeit.Timer("use_sets(a, b)", """from __main__ import use_sets
a, b = range(2)""")
print "use_sets()", t.timeit(number=1000000)

t=timeit.Timer("use_lists(a, b)", """from __main__ import use_lists
a, b = range(2)""")
print "use_lists()", t.timeit(number=1000000)

Produces:

use_sets() 1.57522511482
use_lists() 0.783344984055

However, for reasons already mentioned here, you benefit from using sets when you are searching large sets. It's impossible to tell by your example where that inflection point is for you and whether or not you'll see the benefit.

I suggest you test it both ways and go with whatever is faster for your specific use-case.

like image 35
Austin Marshall Avatar answered Dec 28 '22 23:12

Austin Marshall