
Concatenate generators in python 3 with +-operator

I was wondering why the + operator is not implemented for generators. The question "Concatenate generator and item" suggests itertools.chain as the solution; nevertheless, I would think the + syntax would be just as readable as it is for concatenating lists.

gen1 = (x for x in [1,2,3])
gen2 = (x for x in [4,5,6])

# Works:
from itertools import chain
print(' '.join(map(str, chain(gen1, gen2))))

# TypeError: unsupported operand type(s) for +: 'generator' and 'generator'
print(' '.join(map(str, gen1 + gen2)))

Is there a (philosophical) reason why + is not available for generators? I think it would make code much more readable than itertools.chain(...). Is there anything ambiguous about gen1 + gen2?

Herbert asked Dec 19 '22 01:12


2 Answers

From the Python mailing list:

https://mail.python.org/pipermail/python-ideas/2010-April/006982.html

On Sun, Apr 4, 2010 at 9:58 AM, cool-RR wrote:

I'm still a bit confused by generators and generator expressions, but what about making a generator.__add__ method, that'll act like itertools.chain?

This gets proposed several times a year. We always shoot it down for the same reasons: iterators are an abstract API and we don't want to burden every iterator implementation with having to implement various operations.

--Guido van Rossum (python.org/~guido)
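To illustrate the point about iterators being an abstract API, here is a minimal sketch. The Countdown class is a hypothetical example, not something from the mailing list thread: it implements only __iter__ and __next__, yet itertools.chain can still concatenate it with a generator, whereas + would require the class to define __add__ itself.

```python
from itertools import chain


class Countdown:
    """Hypothetical minimal iterator: only __iter__ and __next__ are defined."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


# chain accepts any iterable, so Countdown needs no extra methods.
combined = list(chain(Countdown(3), (x for x in [10, 20])))
print(combined)  # [3, 2, 1, 10, 20]

# The + operator fails because Countdown does not define __add__.
try:
    Countdown(3) + Countdown(2)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

A free function such as chain only relies on the iterator protocol, which is presumably why it fits the "abstract API" argument better than an operator every iterator would have to implement.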

Alex Riley answered Dec 28 '22 06:12


With the current implementation of __add__ for numbers, strings and sequences, the method returns a new object that is independent of the operands, i.e.:

x = [1]
y = [2]
z = x + y
x.append(1)
assert z == [1, 2] # z is not affected by the mutation of x

However, generators are both mutable (repeated calls to next do not necessarily return the same value) and, more importantly, lazy. That is, a generator object returned by __add__ would still be intrinsically linked to its operands, e.g.:

x = iter(range(3))
y = iter(range(3, 6))
z = x + y # hypothetical: iterators do not actually support +
a = list(x)
b = list(z) # what should the contents of z be?

By the convention of the current semantics of __add__, the contents should be [0, 1, 2, 3, 4, 5]. However, because the generators are not independent, the result would be [3, 4, 5]. The following wrapper class, which implements + on top of itertools.chain, demonstrates this:

import itertools

class ExtendedGenerator:

    def __init__(self, gen):
        self.gen = iter(gen)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self.gen)

    def __add__(self, other):
        try:
            return ExtendedGenerator(itertools.chain(self.gen, iter(other)))
        except TypeError:
            return NotImplemented

    def __radd__(self, other):
        try:
            return ExtendedGenerator(itertools.chain(iter(other), self.gen))
        except TypeError:
            return NotImplemented

x = ExtendedGenerator(range(3))
y = range(3, 6)
z = x + y
a = list(x)
b = list(z) # what should the contents of z be?

print("a:", a)
print("b:", b)

This prints:

a: [0, 1, 2]
b: [3, 4, 5]
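The same linkage can be seen with itertools.chain directly, without any wrapper class. The following is a small sketch using the same ranges as above: because chain is lazy, exhausting x before consuming z leaves z with only the items from y.

```python
from itertools import chain

x = iter(range(3))
y = iter(range(3, 6))
z = chain(x, y)  # lazy: nothing has been read from x or y yet

a = list(x)  # exhausts x before z is consumed
b = list(z)  # z finds x already empty and yields only y's items

print("a:", a)  # a: [0, 1, 2]
print("b:", b)  # b: [3, 4, 5]
```

With chain this behaviour is arguably less surprising, since a function call carries no expectation that the result is independent of its arguments, unlike +.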
Dunes answered Dec 28 '22 05:12