 

Copying iterators in Python

When you make a (shallow) copy of a list iterator, you get a new iterator that can be consumed independently of the original.

from copy import copy


data = [1, 2, 3, 4]

iter1 = iter(data)
iter2 = copy(iter1)    # a second list iterator over the same list

list(iter1)    # [1, 2, 3, 4]
list(iter2)    # [1, 2, 3, 4]

When you make a shallow copy of an iterator created with itertools, the copy behaves like the same iterator (consuming the original also exhausts the copy), but deepcopy returns an independent one.

from copy import copy, deepcopy
from itertools import takewhile


data = [1, 2, 3, 4]

iter1 = takewhile(lambda x: x < 5, data)
iter2 = copy(iter1)        # shallow copy
iter3 = deepcopy(iter1)    # deep copy

list(iter1)    # [1, 2, 3, 4]
list(iter2)    # [] because consuming iter1 also exhausted iter2
list(iter3)    # [1, 2, 3, 4]

Is this the expected behavior? Nowhere in the documentation did I find any information about copying iterators. I know that there is an itertools.tee() function, but its usability is limited (e.g. when we iterate over a changing collection).

skrzacik320 asked Nov 07 '22 12:11



1 Answer

This is known behavior. An iterator is stateful: its current position is its state. A shallow copy of your takewhile object is a new wrapper around the same underlying iterator, so it makes sense that the copy shares that state, as in the example you show.
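The sharing can be reproduced without itertools. The sketch below uses a hypothetical class, Below, as a stand-in for takewhile (it is not how takewhile is implemented). It is written this way partly because, as far as I know, copying itertools objects has been deprecated since Python 3.12, so the original example may warn or fail on newer versions.

```python
from copy import copy, deepcopy


class Below:
    """Hypothetical stand-in for takewhile: yields items while they are below a limit."""

    def __init__(self, limit, iterable):
        self.limit = limit
        self.source = iter(iterable)  # the position lives in this inner iterator

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self.source)  # StopIteration propagates when the source is exhausted
        if item >= self.limit:
            raise StopIteration
        return item


data = [1, 2, 3, 4]
original = Below(5, data)
shallow = copy(original)    # new wrapper object, same inner iterator
deep = deepcopy(original)   # new wrapper object, new inner iterator

first = list(original)      # consumes the shared inner iterator
second = list(shallow)      # nothing left to read
third = list(deep)          # unaffected by the two lines above

print(shallow is original)                # False
print(shallow.source is original.source)  # True
print(first, second, third)               # [1, 2, 3, 4] [] [1, 2, 3, 4]
```

So the shallow copy is a distinct object, but the state that matters (the position) sits in the inner iterator that both wrappers point to.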

A spec for "Copyable Iterators" was actually proposed in a PEP (PEP 323) way back in 2003.

The PEP points out that "'support' for [separately iterable] copy.copy in a user-coded iterator type is almost invariably 'accidental'.... [T]he copy will be independently iterable with respect to the original only if" certain conditions exist in the implementation.
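One way this "accidental" support can show up, as a sketch: whether a shallow copy is independent depends on how __next__ changes the instance. The two classes below, CountUp and Drain, are hypothetical and only illustrate the difference between rebinding an attribute and mutating an object that the copy also references.

```python
from copy import copy


class CountUp:
    """Hypothetical iterator whose __next__ only rebinds an attribute."""

    def __init__(self, stop):
        self.current = 0
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        self.current += 1  # rebinds self.current to a new int object
        return self.current


class Drain:
    """Hypothetical iterator whose __next__ mutates a list in place."""

    def __init__(self, items):
        self.pending = list(items)

    def __iter__(self):
        return self

    def __next__(self):
        if not self.pending:
            raise StopIteration
        return self.pending.pop(0)  # mutates the list shared with any shallow copy


counter = CountUp(3)
counter_copy = copy(counter)
counter_items = list(counter)
counter_copy_items = list(counter_copy)
print(counter_items, counter_copy_items)  # [1, 2, 3] [1, 2, 3]

drain = Drain("abc")
drain_copy = copy(drain)
drain_items = list(drain)
drain_copy_items = list(drain_copy)
print(drain_items, drain_copy_items)      # ['a', 'b', 'c'] []
```

Nothing in either class was written with copying in mind. CountUp happens to copy well and Drain happens not to.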

That PEP was deferred and never adopted.

Joshua Fox answered Nov 15 '22 05:11
