
python -- callable iterator size?

I am searching a text file for a certain string with re.finditer(pattern, text).

I would like to know when this returns nothing, meaning that it found no match in the text passed to it.

I know that callable iterators have next() and __iter__().

I would like to know whether I can get its size, or otherwise find out that no string matched my pattern.

myusuf3 asked Jul 27 '10 19:07




1 Answer

This solution uses less memory because it does not store intermediate results, unlike the other solutions that build a list:

sum(1 for _ in re.finditer(pattern, text))

The older solutions all have the disadvantage of consuming a lot of memory if the pattern matches very frequently in the text, as with the pattern '[a-z]'.
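A minimal sketch of the two things the question asks for, counting the matches and detecting that there are none, without building a list. The helper names count_matches and has_match are hypothetical and chosen only for illustration; re.finditer, sum and next are the standard calls.

```python
import re

def count_matches(pattern, text):
    """Count non-overlapping matches without storing the match objects."""
    return sum(1 for _ in re.finditer(pattern, text))

def has_match(pattern, text):
    """Return True if the pattern matches anywhere in text."""
    # next() with a default returns None instead of raising StopIteration
    # when the iterator is empty; only the first match is ever produced.
    return next(re.finditer(pattern, text), None) is not None

print(count_matches(r"a", "banana"))  # 3
print(has_match(r"z", "banana"))      # False
```

If only the yes/no answer is needed, has_match stops at the first match, whereas count_matches always scans the whole text.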

Test case:

pattern = 'a'
text = 10240000 * 'a'

The sum(1 for ...) solution needs roughly only the memory for the text itself, that is, len(text) bytes. The previous solutions that use list can need approximately 58 or 110 times more memory than necessary: about 580 MB on 32-bit and 1.1 GB on 64-bit Python 2.7.
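One way to check the difference on your own machine is the standard tracemalloc module (Python 3.4+). This is only a rough sketch: peak_bytes is a hypothetical helper, the text is deliberately much shorter than in the test case above so the run stays quick, and the absolute numbers will differ from the Python 2.7 figures quoted here.

```python
import re
import tracemalloc

def peak_bytes(func):
    """Run func() and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

pattern = "a"
text = 100_000 * "a"  # created before tracing starts, so it is not counted

# Builds a list holding every match object at once.
peak_list = peak_bytes(lambda: len(list(re.finditer(pattern, text))))
# Keeps only one match object alive at a time.
peak_sum = peak_bytes(lambda: sum(1 for _ in re.finditer(pattern, text)))

print("list:", peak_list, "bytes")
print("sum: ", peak_sum, "bytes")
```

The list variant should report a peak that grows with the number of matches, while the sum variant should stay roughly constant.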

hynekcer answered Sep 17 '22 23:09