
The order of nested list comprehensions and nested generator expressions in Python

I'm new to Python and am confused by a piece of code in Python's official documentation.

unique_words = set(word  for line in page  for word in line.split())

To me, it looks equivalent to:

unique_words=set()
for word in line.split():
    for line in page:
        unique_words.add(word)

How can line be used in the first loop before it's defined in the nested loop? Yet the code actually works. I think this suggests that the for clauses in a nested list comprehension or generator expression are evaluated from left to right, which contradicts my previous understanding.

Can anyone clarify the correct order for me?

Loopz asked Nov 05 '14 14:11


3 Answers

word for line in page for word in line.split()

This part works like the following nested loops:

for line in page:
    for word in line.split():
        print(word)

Inside the parentheses it is a generator expression, so the overall statement works like this generator function:

def solve():
    for line in page:
        for word in line.split():
            yield word

set() then removes the duplicates, since the code is meant to collect the unique words.
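As a rough check of that equivalence, here is a minimal sketch. The contents of `page` are made-up sample data (the question does not show any), and `solve` takes `page` as a parameter only to keep the example self-contained.

```python
# Hypothetical sample data: `page` is assumed to be an iterable of text lines.
page = ["the quick brown fox", "the lazy dog", "quick quick fox"]


def solve(page):
    # Same loop order as the generator expression: the leftmost `for` is outermost.
    for line in page:
        for word in line.split():
            yield word


from_genexp = set(word for line in page for word in line.split())
from_function = set(solve(page))

print(from_genexp == from_function)  # True
print(sorted(from_genexp))  # ['brown', 'dog', 'fox', 'lazy', 'quick', 'the']
```

Both forms visit the words in the same order, so the resulting sets are identical.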

Vishnu Upadhyay answered Oct 27 '22 18:10


You have the loops in the wrong order: the leftmost for clause is the outermost loop. Compare these two equivalent versions:

unique_words = set(word for line in page for word in line.split())
print(unique_words)

l = []
for line in page:
    for word in line.split():
        l.append(word)
print(set(l))

output:

C:\...>python test.py
set(['sdaf', 'sadfa', 'sfsf', 'fsdf', 'fa', 'sdf', 'asd', 'asdf'])
set(['sdaf', 'sadfa', 'sfsf', 'fsdf', 'fa', 'sdf', 'asd', 'asdf'])

Vincent Beltman answered Oct 27 '22 19:10


From the tutorial in the official documentation:

A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it. For example, this listcomp combines the elements of two lists if they are not equal:
>>> [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
and it’s equivalent to:
>>> combs = []
>>> for x in [1,2,3]:
...     for y in [3,1,4]:
...         if x != y:
...             combs.append((x, y))
...
>>> combs
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
Note how the order of the for and if statements is the same in both these snippets.

See the last sentence quoted above.
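To see why the order matters, here is a small sketch of what happens when the clauses are swapped into the order the question assumed. It assumes Python 3, where comprehension loop variables do not leak into the enclosing scope, and the contents of `page` are made-up sample data.

```python
# Hypothetical sample data; assumes Python 3 scoping rules for comprehensions.
page = ["a b", "c"]

# Clauses in the documented order: `line` is bound before `line.split()` runs.
words = [word for line in page for word in line.split()]
print(words)  # ['a', 'b', 'c']

# Clauses swapped: the leftmost iterable is evaluated first, and `line`
# has not been bound yet at that point, so a NameError is raised.
error = None
try:
    swapped = [word for word in line.split() for line in page]
except NameError as exc:
    error = exc
    print("NameError:", exc)
```

In Python 2, list comprehension variables leak into the surrounding scope, so the swapped form may run without an error there if an earlier `line` is still defined, which can hide the mistake.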

Also note that the construct you're describing is not (officially) called a "nested list comprehension". A nested list comprehension entails a list comprehension which is within another list comprehension, such as (again from the tutorial):

[[row[i] for row in matrix] for i in range(4)]

The thing you're asking about is simply a list comprehension with multiple for clauses.
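The difference between the two constructs can be sketched as follows. The `matrix` values are illustrative sample data, and the names `nested` and `flat` are made up for this example.

```python
# Illustrative 3x4 matrix (sample data for this sketch).
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# Nested list comprehension: the inner comprehension is the *expression*
# of the outer one, so the result is a list of lists (here, the transpose).
nested = [[row[i] for row in matrix] for i in range(4)]
print(nested)  # [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

# One comprehension with multiple for clauses: the clauses read left to
# right like nested for statements, and the result is a single flat list.
flat = [value for row in matrix for value in row]
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```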

John Y answered Oct 27 '22 19:10