This may be a stupid question but I will ask it anyway. I have a generator object:
>>> def gen():
...     for i in range(10):
...         yield i
...
>>> obj=gen()
I can measure its size:
>>> obj.__sizeof__()
24
It is said that generators get consumed:
>>> for i in obj:
...     print i
...
0
1
2
3
4
5
6
7
8
9
>>> obj.__sizeof__()
24
...but obj.__sizeof__() remains the same.
With strings it works as I expected:
>>> 'longstring'.__sizeof__()
34
>>> 'str'.__sizeof__()
27
I would be thankful if someone could enlighten me.
Python generators are a simple way of creating iterators: a generator is a function that uses a yield statement instead of return, and calling it produces an iterator object that hands back its values one at a time. You access those values by calling next() on the generator object or by looping over it with a for loop; once all values have been produced, the generator raises a StopIteration exception.
The __sizeof__() method, on the other hand, returns the size of the object itself in bytes, without any overhead. It says nothing about how many values the generator will produce.
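To make that concrete, here is a minimal sketch in Python 3 syntax (the small generator and its values are illustrative):

```python
def gen():
    # A generator function: calling it returns an iterator, not a value.
    for i in range(3):
        yield i

obj = gen()
print(next(obj))  # 0
print(next(obj))  # 1
print(next(obj))  # 2
try:
    next(obj)     # no values left
except StopIteration:
    print("exhausted")
```

A for loop does the same thing implicitly: it calls next() until StopIteration is raised, and swallows the exception for you.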
__sizeof__() does not do what you think it does. The method returns the internal size in bytes of the given object, not the number of items a generator is going to return.
Python cannot know the length of a generator beforehand. Take for example the following endless generator (just an example; there are better ways to create a counter):
def count():
    count = 0
    while True:
        yield count
        count += 1
That generator is endless; there is no length assignable to it. Yet the generator object itself takes memory:
>>> count().__sizeof__()
88
You don't normally call __sizeof__() directly; you leave that to the sys.getsizeof() function, which also adds garbage collector overhead.
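The point is easy to verify: a generator's size is a small, fixed overhead regardless of how many items it will yield, while a list grows with its contents. A minimal sketch in Python 3 syntax (the exact byte counts vary by Python version, so none are asserted here):

```python
import sys

def gen():
    for i in range(1000):
        yield i

obj = gen()
print(sys.getsizeof(obj))        # a small, fixed number of bytes
print(sys.getsizeof(list(obj)))  # much larger: the list holds all 1000 items
```

Note that building the list consumed the generator; the generator object's own size did not change in the process.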
If you know a generator is going to be finite and you need to know how many items it returns, use:
sum(1 for item in generator)
but note that this exhausts the generator.
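The exhaustion caveat can be seen directly. A short sketch in Python 3 syntax (the generator name is illustrative):

```python
def finite_gen():
    # Yields five values, then stops.
    yield from range(5)

g = finite_gen()
print(sum(1 for _ in g))  # 5  -- counts the items by consuming them
print(list(g))            # [] -- the generator is now exhausted
```

If you need both the count and the values, collect them into a list first and use len() on that.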
As said in other answers, __sizeof__ returns a different thing. Only some iterators have a method that reports the number of elements not yet returned. For example, listiterator has a corresponding __length_hint__ method:
>>> L = [1,2,3,4,5]
>>> it = iter(L)
>>> it
<listiterator object at 0x00E65350>
>>> it.__length_hint__()
5
>>> help(it.__length_hint__)
Help on built-in function __length_hint__:
__length_hint__(...)
Private method returning an estimate of len(list(it)).
>>> it.next()
1
>>> it.__length_hint__()
4
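(The session above is Python 2; in Python 3 the same idea is exposed through the operator.length_hint() function, and next() is a builtin rather than a method. A short sketch:)

```python
import operator

it = iter([1, 2, 3, 4, 5])
print(operator.length_hint(it))  # 5
next(it)
print(operator.length_hint(it))  # 4

# Plain generators provide no hint, so length_hint falls back to the default.
g = (x for x in range(10))
print(operator.length_hint(g, -1))  # -1: no estimate available
```

This is only ever a hint for preallocation; it is not guaranteed to be exact for arbitrary iterators.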
__sizeof__ returns the memory size of an object in bytes, not the length of a generator, which is impossible to determine up front because generators can run indefinitely.