I am having trouble understanding how the zip() function works in Python when an iterator is passed in instead of an iterable.
Have a look at these two print statements:
string = "ABCDEFGHI"
print(list(zip(*[iter(string)] * 3)))
# Output: [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'H', 'I')]
print(list(zip(*[string] * 3)))
# Output: [('A', 'A', 'A'), ('B', 'B', 'B'), ('C', 'C', 'C'), ('D', 'D', 'D'), ('E', 'E', 'E'), ('F', 'F', 'F'), ('G', 'G', 'G'), ('H', 'H', 'H'), ('I', 'I', 'I')]
Can someone explain how zip() works in both cases?
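The difference is that [iter(string)] * 3 builds a list holding three references to a single iterator, so zip receives the same iterator as all three arguments. For [string] * 3, zip calls iter() on each argument and gets a fresh, independent iterator per argument. The shorter output without duplicates is the result of zip exhausting the single aliased iterator, three elements per tuple.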
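The aliasing can be checked directly with identity tests. This is only a small sketch; the names shared, separate and its are made up for illustration and do not appear in the question.

```python
string = "ABCDEFGHI"

# One iterator object, referenced three times by the list.
shared = [iter(string)] * 3
print(shared[0] is shared[1] is shared[2])  # True

# Calling iter() on an iterator returns that same iterator,
# so zip cannot "split" it into independent copies.
print(iter(shared[0]) is shared[0])  # True

# A str is iterable but is not itself an iterator:
# every iter() call on it builds a new, independent iterator.
separate = [string] * 3
its = [iter(s) for s in separate]
print(its[0] is its[1])  # False
```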
See "what is meaning of [iter(list)]*2 in python?" for more details on how [iter(...)] * 2 works and causes potentially unexpected results.
See the canonical answer "List of lists changes reflected across sublists unexpectedly" if the [...] * 3 aliasing behavior is surprising.
Let's use a clearer example:
a = iter("123456") # One iterator
list(zip(a, a, a))
# [('1', '2', '3'), ('4', '5', '6')]
vs
a = iter("123456")
b = iter("123456")
c = iter("123456")
list(zip(a, b, c))
# [('1', '1', '1'), ('2', '2', '2'), ('3', '3', '3'), ('4', '4', '4'), ('5', '5', '5'), ('6', '6', '6')]
In the first example, a can only yield 6 elements in total, and zip has to pull 3 of them from it every time it builds a tuple, so a is exhausted after 2 tuples. In contrast, the second example has 18 elements in total across three independent iterators, and zip yields them in 6 groups of 3.
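To make the order of the next() calls visible, here is a rough pure-Python approximation of what zip does. zip_sketch is a hypothetical helper written for this answer, not the real implementation (the built-in is written in C), but it pulls one item from each argument in turn and stops at the first exhausted iterator in the same way.

```python
def zip_sketch(*iterables):
    """Simplified stand-in for zip(); illustration only."""
    # iter() returns the argument itself if it is already an iterator.
    iterators = [iter(obj) for obj in iterables]
    if not iterators:
        return
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))  # one next() call per argument, left to right
            except StopIteration:
                return  # stop as soon as any iterator runs out
        yield tuple(row)


a = iter("123456")
grouped = list(zip_sketch(a, a, a))  # all three next() calls hit the same iterator
paired = list(zip_sketch("123456", "123456", "123456"))  # three separate iterators

print(grouped)  # [('1', '2', '3'), ('4', '5', '6')]
print(len(paired), paired[0])  # 6 ('1', '1', '1')
```

With the shared iterator, each tuple consumes three consecutive characters; with separate iterators, each one advances by a single character per tuple.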