
Adapt an iterator to behave like a file-like object in Python

Tags:

python

I have a generator that yields strings. Is there a utility or adapter in Python that could make it look like a file?

For example,

    >>> def str_fn():
    ...     for c in 'a', 'b', 'c':
    ...         yield c * 3
    ...
    >>> for s in str_fn():
    ...     print(s)
    ...
    aaa
    bbb
    ccc
    >>> stream = some_magic_adaptor(str_fn())
    >>> while True:
    ...     data = stream.read(4)
    ...     if not data:
    ...         break
    ...     print(data)
    ...
    aaab
    bbcc
    c

Because the data may be large and needs to be streamed (each fragment is a few kilobytes, the entire stream is tens of megabytes), I do not want to eagerly evaluate the whole generator before passing it to the stream adaptor.

Alex B asked Sep 26 '12 02:09





1 Answer

The "correct" way to do this is to inherit from one of the standard Python io abstract base classes. However, Python does not appear to let you provide a raw text class and wrap it with a buffered reader of any kind.

The best class to inherit from is TextIOBase. Here's such an implementation, handling read and readline while being mindful of performance. (gist)

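    import io

    class StringIteratorIO(io.TextIOBase):

        def __init__(self, iter):
            self._iter = iter   # iterator yielding str chunks
            self._left = ''     # unread remainder of the current chunk

        def readable(self):
            return True

        def _read1(self, n=None):
            # Return up to n characters from the current chunk, fetching the
            # next non-empty chunk first if needed. Returns '' once the
            # iterator is exhausted.
            while not self._left:
                try:
                    self._left = next(self._iter)
                except StopIteration:
                    break
            ret = self._left[:n]
            self._left = self._left[len(ret):]
            return ret

        def read(self, n=None):
            l = []
            if n is None or n < 0:
                # No size limit: read until the iterator is exhausted.
                while True:
                    m = self._read1()
                    if not m:
                        break
                    l.append(m)
            else:
                # Collect pieces until n characters are read or data runs out.
                while n > 0:
                    m = self._read1(n)
                    if not m:
                        break
                    n -= len(m)
                    l.append(m)
            return ''.join(l)

        def readline(self):
            # Accumulate chunks until a newline is found or the iterator ends.
            l = []
            while True:
                i = self._left.find('\n')
                if i == -1:
                    l.append(self._left)
                    try:
                        self._left = next(self._iter)
                    except StopIteration:
                        self._left = ''
                        break
                else:
                    l.append(self._left[:i+1])
                    self._left = self._left[i+1:]
                    break
            return ''.join(l)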
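The limitation mentioned above concerns text. If the iterator yields bytes instead, a raw class can be wrapped by the standard buffering layers. The following is only a minimal sketch of that variant, not part of the implementation above: the names IterRawIO and byte_chunks are made up for illustration, and it assumes every chunk is a bytes object. It subclasses io.RawIOBase, implements readinto, and lets io.BufferedReader (and optionally io.TextIOWrapper) do the rest.

```python
import io


class IterRawIO(io.RawIOBase):
    """Hypothetical raw stream over an iterator of bytes chunks."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._left = b''  # unread remainder of the current chunk

    def readable(self):
        return True

    def readinto(self, buf):
        # Refill from the iterator, skipping empty chunks.
        while not self._left:
            try:
                self._left = next(self._chunks)
            except StopIteration:
                return 0  # 0 bytes read signals end of stream
        n = min(len(buf), len(self._left))
        buf[:n] = self._left[:n]
        self._left = self._left[n:]
        return n


def byte_chunks():
    # Bytes version of the generator from the question.
    for c in b'a', b'b', b'c':
        yield c * 3


# Buffered binary reads, four bytes at a time.
stream = io.BufferedReader(IterRawIO(byte_chunks()))
pieces = []
while True:
    data = stream.read(4)
    if not data:
        break
    pieces.append(data)

# Decoding to text on top of the same raw class (UTF-8 assumed here).
text = io.TextIOWrapper(io.BufferedReader(IterRawIO(byte_chunks())),
                        encoding='utf-8')
everything = text.read()
```

The generator is still consumed lazily: readinto pulls a new chunk only when the previous one has been fully handed out, so the whole stream is never held in memory at once.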
Matt Joiner answered Sep 21 '22 05:09
