It is well known that tuples are not defined by parentheses, but commas. Quote from documentation:
A tuple consists of a number of values separated by commas
Therefore:
myVar1 = 'a', 'b', 'c'
type(myVar1)
# Result:
<type 'tuple'>
Another striking example is this:
myVar2 = ('a')
type(myVar2)
# Result:
<type 'str'>
myVar3 = ('a',)
type(myVar3)
# Result:
<type 'tuple'>
Even the single-element tuple needs a comma, and parentheses are always used just to avoid confusion. My question is: Why can't we omit parentheses of arrays in a list comprehension? For example:
myList1 = ['a', 'b']
myList2 = ['c', 'd']
print([(v1,v2) for v1 in myList1 for v2 in myList2])
# Works, result:
[('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')]
print([v1,v2 for v1 in myList1 for v2 in myList2])
# Does not work, result:
SyntaxError: invalid syntax
Isn't the second list comprehension just syntactic sugar for the following loop, which does work?
myTuples = []
for v1 in myList1:
for v2 in myList2:
myTuple = v1,v2
myTuples.append(myTuple)
print myTuples
# Result:
[('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')]
Creating a TupleThe parentheses are optional, however, it is a good practice to use them. A tuple can have any number of items and they may be of different types (integer, float, list, string, etc.). A tuple can also be created without using parentheses.
There is no tuple comprehension in Python. Comprehension works by looping or iterating over items and assigning them into a container, a Tuple is unable to receive assignments.
We can perform comprehension to sequences such as lists, sets, dictionaries but not to tuples. For example, if we want to perform list comprehension, we use square brackets [ ].
Generator Expression. We learned that we use curly brackets for the set comprehension and square brackets for the list comprehension.
Python's grammar is LL(1), meaning that it only looks ahead one symbol when parsing.
[(v1, v2) for v1 in myList1 for v2 in myList2]
Here, the parser sees something like this.
[ # An opening bracket; must be some kind of list
[( # Okay, so a list containing some value in parentheses
[(v1
[(v1,
[(v1, v2
[(v1, v2)
[(v1, v2) for # Alright, list comprehension
However, without the parentheses, it has to make a decision earlier on.
[v1, v2 for v1 in myList1 for v2 in myList2]
[ # List-ish thing
[v1 # List containing a value; alright
[v1, # List containing at least two values
[v1, v2 # Here's the second value
[v1, v2 for # Wait, what?
A parser which backtracks tends to be notoriously slow, so LL(1) parsers do not backtrack. Thus, the ambiguous syntax is forbidden.
As I felt "because the grammar forbids it" to be a little too snarky, I came up with a reason.
It begins parsing the expression as a list/set/tuple and is expecting a ,
and instead encounters a for
token.
For example:
$ python3.6 test.py
File "test.py", line 1
[a, b for a, b in c]
^
SyntaxError: invalid syntax
tokenizes as follows:
$ python3.6 -m tokenize test.py
0,0-0,0: ENCODING 'utf-8'
1,0-1,1: OP '['
1,1-1,2: NAME 'a'
1,2-1,3: OP ','
1,4-1,5: NAME 'b'
1,6-1,9: NAME 'for'
1,10-1,11: NAME 'a'
1,11-1,12: OP ','
1,13-1,14: NAME 'b'
1,15-1,17: NAME 'in'
1,18-1,19: NAME 'c'
1,19-1,20: OP ']'
1,20-1,21: NEWLINE '\n'
2,0-2,0: ENDMARKER ''
There was no parser issue that motivated this restriction. Contrary to Silvio Mayolo's answer, an LL(1) parser could have parsed the no-parentheses syntax just fine. The parentheses were optional in early versions of the original list comprehension patch; they were only made mandatory to make the meaning clearer.
Quoting Guido van Rossum back in 2000, in a response to someone worried that [x, y for ...]
would cause parser issues,
Don't worry. Greg Ewing had no problem expressing this in Python's own grammar, which is about as restricted as parsers come. (It's LL(1), which is equivalent to pure recursive descent with one lookahead token, i.e. no backtracking.)
Here's Greg's grammar:
atom: ... | '[' [testlist [list_iter]] ']' | ... list_iter: list_for | list_if list_for: 'for' exprlist 'in' testlist [list_iter] list_if: 'if' test [list_iter]
Note that before, the list syntax was
'[' [testlist] ']'
. Let me explain it in different terms:The parser parses a series comma-separated expressions. Previously, it was expecting
']'
as the sole possible token following this. After the change,'for'
is another possible following token. This is no problem at all for any parser that knows how to parse matching parentheses!If you'd rather not support
[x, y for ...]
because it's ambiguous (to the human reader, not to the parser!), we can change the grammar to something like:'[' test [',' testlist | list_iter] ']'
(Note that
|
binds less than concatenation, and[...]
means an optional part.)
Also see the next response in the thread, where Greg Ewing runs
>>> seq = [1,2,3,4,5]
>>> [x, x*2 for x in seq]
[(1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]
on an early version of the list comprehension patch, and it works just fine.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With