Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does python thread consume so much memory?

Why does python thread consumes so much memory?

I measured that spawning one thread consumes 8 megs of memory, almost as big as a whole new python process!

OS: Ubuntu 10.10

Edit: due to popular demand I'll give some extraneous examples, here it is:

from os import getpid
from time import sleep
from threading import Thread

def nap():
    print 'sleeping child'
    sleep(999999999)

print getpid()
child_thread = Thread(target=nap)
sleep(999999999)

On my box, pmap pid will give 9424K

Now, let's run the child thread:

from os import getpid
from time import sleep
from threading import Thread

def nap():
    print 'sleeping child'
    sleep(999999999)

print getpid()
child_thread = Thread(target=nap)
child_thread.start()             # <--- ADDED THIS LINE
sleep(999999999)

Now pmap pid will give 17620K

So, the cost for the extra thread is 17620K - 9424K = 8196K

ie. 87% of running a whole new separate process!

Now isn't that just, wrong?

like image 735
Ron Avatar asked Apr 12 '11 14:04

Ron


People also ask

How much memory does a Python thread use?

Why does python thread consumes so much memory? I measured that spawning one thread consumes 8 megs of memory, almost as big as a whole new python process! ie. 87% of running a whole new separate process!

Why Python consumes more memory?

Those numbers can easily fit in a 64-bit integer, so one would hope Python would store those million integers in no more than ~8MB: a million 8-byte objects. In fact, Python uses more like 35MB of RAM to store these numbers. Why? Because Python integers are objects, and objects have a lot of memory overhead.

Why is Python threading slow?

This is due to the Python GIL being the bottleneck preventing threads from running completely concurrently. The best possible CPU utilisation can be achieved by making use of the ProcessPoolExecutor or Process modules which circumvents the GIL and make code run more concurrently.

Does multithreading reduce memory usage?

Technically yes. You can use more threads to read from different places in memory. CPU is faster so it could issues lots of reads, say one read per thread, until the result from the first read comes back. Then start processing the requests.


1 Answers

This is not Python-specific, and has to do with the separate stack that gets allocated by the OS for every thread. The default maximum stack size on your OS happens to be 8MB.

Note that the 8MB is simply a chunk of address space that gets set aside, with very little memory committed to it initially. Additional memory gets committed to the stack when required, up to the 8MB limit.

The limit can be tweaked using ulimit -s, but in this instance I see no reason to do this.

As an aside, pmap shows address space usage. It isn't a good way to gauge memory usage. The two concepts are quite distinct, if related.

like image 83
NPE Avatar answered Oct 11 '22 13:10

NPE