When you make a shallow copy of an iterator, you get a new, independent iterator:
from copy import copy
data = [1, 2, 3, 4]
iter1 = iter(data)
iter2 = copy(iter1)
list(iter1)  # [1, 2, 3, 4]
list(iter2)  # [1, 2, 3, 4]
When you make a shallow copy of an iterator created with itertools, copy seems to return the same iterator, but deepcopy returns a new one:
from copy import copy, deepcopy
from itertools import takewhile
data = [1, 2, 3, 4]
iter1 = takewhile(lambda x: x < 5, data)
iter2 = copy(iter1)
iter3 = deepcopy(iter1)
list(iter1)  # [1, 2, 3, 4]
list(iter2)  # [] because it's the same iterator
list(iter3)  # [1, 2, 3, 4]
Is this the expected behavior? Nowhere in the documentation did I find any information about copying iterators. I know that there is an itertools.tee() function, but its usability is limited (e.g. when we iterate over a changing collection).
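This is known behavior. An iterator is stateful: its current position is its state. A shallow copy creates a new outer object but shares whatever objects the original refers to. In your second example the position lives in the inner list iterator that takewhile wraps, so the shallow copy shares that position and finds it already exhausted. deepcopy also copies the inner iterator, which is why iter3 iterates independently.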
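To make the sharing concrete, here is a minimal sketch. `TakeWhileLess` is a hypothetical, simplified stand-in for an itertools-style wrapper (it is not how itertools is actually implemented). I used a hand-written class on purpose: itertools' own copy support was deprecated in Python 3.12 (and, as far as I know, removed in 3.14), so this example does not rely on it.

```python
from copy import copy, deepcopy


class TakeWhileLess:
    """Hypothetical, simplified wrapper: yields items while they are below `limit`."""

    def __init__(self, limit, iterable):
        self.limit = limit
        self.source = iter(iterable)  # the actual position lives in this inner iterator

    def __iter__(self):
        return self

    def __next__(self):
        value = next(self.source)
        if value >= self.limit:
            raise StopIteration
        return value


data = [1, 2, 3, 4]
original = TakeWhileLess(5, data)
shallow = copy(original)      # new wrapper, same inner iterator
deep = deepcopy(original)     # new wrapper, separate inner iterator

is_new_object = shallow is not original              # True
shares_source = shallow.source is original.source    # True
deep_is_separate = deep.source is not original.source  # True

from_original = list(original)  # [1, 2, 3, 4]
from_shallow = list(shallow)    # [] (the shared inner iterator is already exhausted)
from_deep = list(deep)          # [1, 2, 3, 4]
```

So the shallow copy is not literally the same object; it is a second wrapper around the same inner iterator, which looks the same from the outside.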
A spec for "Copyable Iterators" was actually proposed in PEP 323, way back in 2003.
They pointed out that "'support' for [separately iterable] copy.copy in a user-coded iterator type is almost invariably 'accidental'.... [T]he copy will be independently iterable with respect to the original only if" certain conditions exist in the implementation.
They decided not to adopt that PEP.
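Here is a rough sketch of how that "accidental" support can play out in user-coded iterators. Both classes are hypothetical and exist only for illustration. The assumption I am demonstrating is that a shallow copy ends up independent when the iterator advances by rebinding an attribute to a new value, and ends up shared when it advances by mutating an object in place.

```python
from copy import copy


class RebindingCounter:
    """Hypothetical iterator that advances by rebinding an int attribute."""

    def __init__(self, stop):
        self.current = 0
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1  # rebinds self.current; the copy keeps its own value
        return value


class MutatingCounter:
    """Hypothetical iterator that advances by mutating a list in place."""

    def __init__(self, stop):
        self.pending = list(range(stop))

    def __iter__(self):
        return self

    def __next__(self):
        if not self.pending:
            raise StopIteration
        return self.pending.pop(0)  # mutates the list that a shallow copy also refers to


a = RebindingCounter(3)
a_copy = copy(a)
rebinding_results = (list(a), list(a_copy))  # ([0, 1, 2], [0, 1, 2])

b = MutatingCounter(3)
b_copy = copy(b)
mutating_results = (list(b), list(b_copy))   # ([0, 1, 2], [])
```

Neither class does anything to support copying, so whether copy.copy gives you an independent iterator depends entirely on how the state happens to be stored.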