
Python generator objects: __sizeof__()

This may be a stupid question but I will ask it anyway. I have a generator object:

>>> def gen():
...     for i in range(10):
...         yield i
...         
>>> obj=gen()

I can measure its size:

>>> obj.__sizeof__()
24

It is said that generators get consumed:

>>> for i in obj:
...     print i
...     
0
1
2
3
4
5
6
7
8
9
>>> obj.__sizeof__()
24

...but obj.__sizeof__() remains the same.

With strings it works as I expected:

>>> 'longstring'.__sizeof__()
34
>>> 'str'.__sizeof__()
27

I would be thankful if someone could enlighten me.

root asked Sep 18 '12 13:09




3 Answers

__sizeof__() does not do what you think it does. The method returns the internal size in bytes for the given object, not the number of items a generator is going to return.

Python cannot know the size of a generator beforehand. Take, for example, the following endless generator (it is only an illustration; there are better ways to create a counter):

def count():
    count = 0
    while True:
        yield count
        count += 1

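That generator is endless; there is no size assignable to it. Yet the generator object itself takes memory (the figure below is from a 64-bit CPython 2 build; the exact number depends on the Python version and platform):

>>> count().__sizeof__()
48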

You don't normally call __sizeof__() directly; you leave that to the sys.getsizeof() function, which also adds garbage collector overhead.
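
A minimal sketch of that distinction, written in Python 3 syntax and assuming CPython. The generator `gen` mirrors the one in the question. Byte counts are printed rather than quoted, because they vary between interpreter versions and platforms.

```python
import sys

def gen():
    for i in range(10):
        yield i

obj = gen()
raw_before = obj.__sizeof__()      # size reported by the object itself
total_before = sys.getsizeof(obj)  # on CPython: the same size plus GC overhead

consumed = list(obj)               # exhaust the generator

raw_after = obj.__sizeof__()
print(raw_before, total_before, raw_after)
print(raw_before == raw_after)     # does the size track how many items are left?
```

Both calls describe the memory footprint of the generator object, so neither one changes as items are consumed.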

If you know a generator is going to be finite and you have to know how many items it returns, use:

sum(1 for item in generator)

but note that that exhausts the generator.
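
As a small illustration of that caveat (Python 3 syntax; `gen` is the generator from the question), counting the items leaves nothing behind. If the values are needed as well as the count, one option is to materialise them with list() first, assuming they fit in memory.

```python
def gen():
    for i in range(10):
        yield i

generator = gen()
n_items = sum(1 for item in generator)  # walks the whole generator once
leftover = list(generator)              # nothing remains after counting

print(n_items)   # 10
print(leftover)  # []

# Alternative when the values are needed too: keep them in a list.
items = list(gen())
print(len(items))  # 10
```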

Martijn Pieters answered Oct 02 '22 06:10


As the other answers say, __sizeof__ returns something different: the memory size of the object.

Only some iterators have a method that returns the number of elements still to be returned. For example, listiterator has a __length_hint__ method for this:

>>> L = [1,2,3,4,5]
>>> it = iter(L)
>>> it
<listiterator object at 0x00E65350>
>>> it.__length_hint__()
5
>>> help(it.__length_hint__)
Help on built-in function __length_hint__:

__length_hint__(...)
    Private method returning an estimate of len(list(it)).

>>> it.next()
1
>>> it.__length_hint__()
4
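
The transcript above is Python 2. A rough Python 3 equivalent (3.4 or later) can use operator.length_hint(), which is the public wrapper around __length_hint__; next(it) replaces it.next(). The fallback value -1 passed for the generator is an arbitrary choice for this sketch.

```python
import operator

L = [1, 2, 3, 4, 5]
it = iter(L)
hint_start = operator.length_hint(it)      # 5 elements remaining
next(it)                                   # consume one element
hint_after_one = operator.length_hint(it)  # 4 elements remaining

def gen():
    for i in range(10):
        yield i

# Generators define neither __len__ nor __length_hint__,
# so length_hint() falls back to the default passed in.
hint_gen = operator.length_hint(gen(), -1)

print(hint_start, hint_after_one, hint_gen)  # 5 4 -1
```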

ovgolovin answered Sep 30 '22 06:09


__sizeof__ returns the memory size of an object in bytes, not the length of a generator, which is impossible to determine up front because a generator can keep yielding values indefinitely.
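
A quick way to see the difference between memory size and length (Python 3 syntax, assuming CPython; `numbers` is a hypothetical helper introduced only for this sketch): the generator object has the same size whether it will yield ten values or a million, while a list holding those values grows with their number.

```python
import sys

def numbers(n):
    # Hypothetical helper: yields 0, 1, ..., n - 1 lazily.
    for i in range(n):
        yield i

small = numbers(10)
large = numbers(10**6)
same_size = sys.getsizeof(small) == sys.getsizeof(large)

list_small = sys.getsizeof(list(range(10)))
list_large = sys.getsizeof(list(range(10**6)))

print(same_size)                # generator size ignores the item count
print(list_small < list_large)  # list size grows with the item count
```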

Hans Then answered Oct 01 '22 06:10