Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do python Set Comprehensions work?

Q1 - Is the following a set() of a generator expression or a set comprehension? (Or are they same? If so, are list & dict comprehensions also corresponding type-cast on generators?)

my_set = {x for x in range(10)}

Q2 - Does the evaluation consider duplicate values & then remove them by applying set()?

dup_set = {x for x in [0, 1, 2, 0, 1, 2]}

Does the comprehension perform (speed-wise) better than regular for loops?

Update - I tried using timeit for speed comparisons. Am not sure if I am being just (fair) about it.

C:\>python -m timeit "s = set()" "for x in range(10):" "
  s.add(x)"
100000 loops, best of 3: 2.3 usec per loop

C:\>python -m timeit "s = {x for x in range(10)}"
1000000 loops, best of 3: 1.68 usec per loop

Now, using some conditionals

C:\>python -m timeit "s = set()" "for x in range(10):" "
  if x%2: s.add(x)"
100000 loops, best of 3: 2.27 usec per loop

C:\>python -m timeit "s = {x for x in range(10) if x%2}"
1000000 loops, best of 3: 1.83 usec per loop

So, there is quite some difference, is it due to the functionality being hardcoded in c?

like image 756
shad0w_wa1k3r Avatar asked Dec 10 '13 14:12

shad0w_wa1k3r


People also ask

How do list comprehensions work in Python?

List comprehensions provide us with a simple way to create a list based on some sequence or another list that we can loop over. In python terminology, anything that we can loop over is called iterable. At its most basic level, list comprehension is a syntactic construct for creating lists from existing lists.

Is set comprehension possible in Python?

The last comprehension we can use in Python is called Set Comprehension. Set comprehensions are similar to list comprehensions but return a set instead of a list. The syntax looks more like Dictionary Comprehension in the sense that we use curly brackets to create a set.

How do build a set using a set comprehension?

A set comprehension carries the following steps: First, iterate over the elements of a set. Second, apply an expression to each element. Third, create a new set of elements resulting from the expression.

How do you define set comprehension?

Set comprehension is a mathematical notation for defining sets on the basis of a property of their members. Although set comprehension is widely used in mathematics and some programming languages, direct support for reasoning about it is still not readily available in state-of-the-art SMT solvers.


1 Answers

Q1 : Yes, yes, yes and yes. Or at least they behave like this. It's a little bit different if you're taking a look at the bytecode. Let's disassembly this code (Python 2.7) :

def list_comp(l):
    return [x+1 for x in l]

def dict_comp(l):
    return {x+1:0 for x in l}

def set_comp(l):
    return {x+1 for x in l}

def generator(l):
    return (x+1 for x in l)

This is what you get:

Disassembly of list_comp:
  2           0 BUILD_LIST              0
              3 LOAD_FAST               0 (l)
              6 GET_ITER            
        >>    7 FOR_ITER               16 (to 26)
             10 STORE_FAST              1 (x)
             13 LOAD_FAST               1 (x)
             16 LOAD_CONST              1 (1)
             19 BINARY_ADD          
             20 LIST_APPEND             2
             23 JUMP_ABSOLUTE           7
        >>   26 RETURN_VALUE
Disassembly of dict_comp:
  5           0 LOAD_CONST              1 (<code object <dictcomp> at 029DEE30)
              3 MAKE_FUNCTION           0
              6 LOAD_FAST               0 (l)
              9 GET_ITER            
             10 CALL_FUNCTION           1
             13 RETURN_VALUE  
Disassembly of set_comp:
  8           0 LOAD_CONST              1 (<code object <setcomp> at 029DECC8)
              3 MAKE_FUNCTION           0
              6 LOAD_FAST               0 (l)
              9 GET_ITER            
             10 CALL_FUNCTION           1
             13 RETURN_VALUE  
Disassembly of generator:
 11           0 LOAD_CONST              1 (<code object <genexpr> at 02A8FD58)
              3 MAKE_FUNCTION           0
              6 LOAD_FAST               0 (l)
              9 GET_ITER            
             10 CALL_FUNCTION           1
             13 RETURN_VALUE                     

The bytecode is barely the same for the dict comprenhension, the set comprehension and the generator. They all load a code object (<dictcomp>, <setcomp> or <genexpr>) and then make a callable function out of it. The list comprehension is different because it generates the bytecode corresponding to your list comprehension. This time it is interpreted and thus not native.

Q2 : It doesn't really consider duplicate values since it creates a comprehension with the list you gave. And then it creates the set with the comprehension.

About timing : List/Dict/Set comprehensions tend to be faster than anything else. Even if they're interpreted, the bytecode generated is optimized for most of the cases with special bytecode instructions like SET_ADD, LIST_APPEND or MAP_ADD.

like image 79
Vincent Avatar answered Oct 06 '22 00:10

Vincent