Why does python thread consumes so much memory?
I measured that spawning one thread consumes 8 megs of memory, almost as big as a whole new python process!
OS: Ubuntu 10.10
Edit: due to popular demand I'll give some extraneous examples, here it is:
from os import getpid
from time import sleep
from threading import Thread
def nap():
print 'sleeping child'
sleep(999999999)
print getpid()
child_thread = Thread(target=nap)
sleep(999999999)
On my box, pmap pid will give 9424K
Now, let's run the child thread:
from os import getpid
from time import sleep
from threading import Thread
def nap():
print 'sleeping child'
sleep(999999999)
print getpid()
child_thread = Thread(target=nap)
child_thread.start() # <--- ADDED THIS LINE
sleep(999999999)
Now pmap pid will give 17620K
So, the cost for the extra thread is 17620K - 9424K = 8196K
ie. 87% of running a whole new separate process!
Now isn't that just, wrong?
Why does python thread consumes so much memory? I measured that spawning one thread consumes 8 megs of memory, almost as big as a whole new python process! ie. 87% of running a whole new separate process!
Those numbers can easily fit in a 64-bit integer, so one would hope Python would store those million integers in no more than ~8MB: a million 8-byte objects. In fact, Python uses more like 35MB of RAM to store these numbers. Why? Because Python integers are objects, and objects have a lot of memory overhead.
This is due to the Python GIL being the bottleneck preventing threads from running completely concurrently. The best possible CPU utilisation can be achieved by making use of the ProcessPoolExecutor or Process modules which circumvents the GIL and make code run more concurrently.
Technically yes. You can use more threads to read from different places in memory. CPU is faster so it could issues lots of reads, say one read per thread, until the result from the first read comes back. Then start processing the requests.
This is not Python-specific, and has to do with the separate stack that gets allocated by the OS for every thread. The default maximum stack size on your OS happens to be 8MB.
Note that the 8MB is simply a chunk of address space that gets set aside, with very little memory committed to it initially. Additional memory gets committed to the stack when required, up to the 8MB limit.
The limit can be tweaked using ulimit -s
, but in this instance I see no reason to do this.
As an aside, pmap
shows address space usage. It isn't a good way to gauge memory usage. The two concepts are quite distinct, if related.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With