I just reviewed some really terrible code - code that sends messages on a serial port by creating a new thread to package and assemble each message, for every single message sent. Yes, for every message a pthread is created, the bits are properly set up, then the thread terminates. I haven't a clue why anyone would do such a thing, but it raises the question: how much overhead is there when actually creating a thread?
In my experience, creating a new thread only costs ~70 µs. That could be considered trivial in many, if not most, use cases. Relatively speaking it is still more expensive than the alternatives, and for some situations a thread pool, or not using threads at all, is the better solution.
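For context, a micro-benchmark that produces a number in that ballpark looks roughly like the sketch below. This is not the exact code behind the ~70 µs figure; it simply times create-plus-join of an empty std::thread, so the result depends heavily on the machine, the standard library, and the load (compile with -pthread):

    #include <chrono>
    #include <cstdio>
    #include <thread>

    int main() {
        constexpr int kIterations = 10000;   // how many threads to launch
        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < kIterations; ++i) {
            std::thread([] {}).join();       // create an empty thread, wait for it to finish
        }
        auto stop = std::chrono::steady_clock::now();
        auto total_us = std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
        std::printf("average create+join cost: %.2f us\n",
                    static_cast<double>(total_us) / kIterations);
        return 0;
    }

Create-plus-join is a pessimistic measure; detaching, as the code under review does, skips the join but still pays the creation cost.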
Creating a new thread is not free. Exactly how expensive it might be depends on several parameters.
Multithreading itself also induces overhead, mainly caused by synchronization, spinning at user level, and NUMA management. This overhead is diverse in nature because it is a function of many system and workload properties. System-level solutions are feasible, but they often imply difficult trade-offs.
Creating a thread is expensive, and each thread's stack requires memory. In addition, if your process is using many threads, context switching can kill performance.
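On the stack point: the per-thread stack size is configurable; on glibc the default is typically the soft stack rlimit, often 8 MiB of virtual address space per thread. A minimal sketch using the pthreads API directly to request a smaller stack (compile with -pthread):

    #include <cstdio>
    #include <pthread.h>

    void* worker(void*) { return nullptr; }   // trivial thread body

    int main() {
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        // Each thread gets its own stack; request a smaller one than the default.
        pthread_attr_setstacksize(&attr, 256 * 1024);   // 256 KiB
        pthread_t tid;
        if (pthread_create(&tid, &attr, worker, nullptr) == 0) {
            pthread_join(tid, nullptr);
        }
        pthread_attr_destroy(&attr);
        return 0;
    }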
To resurrect this old thread, I just wrote some simple test code:
    #include <thread>

    int main(int argc, char** argv) {
        for (volatile int i = 0; i < 500000; i++)
            std::thread([](){}).detach();
        return 0;
    }
I compiled it with

    g++ test.cpp -std=c++11 -lpthread -O3 -o test

I then ran it three times in a row on an old (kernel 2.6.18), heavily loaded (doing a database rebuild), slow laptop (Intel Core i5-2540M). Results from three consecutive runs: 5.647 s, 5.515 s, and 5.561 s. So we're looking at a tad over 10 microseconds per thread on this machine, probably much less on yours.
That's not much overhead at all, given that serial ports max out at around one bit per 10 microseconds. Now, of course, there are additional thread-related costs involving passed/captured arguments (although plain function calls impose some of those too), cache slowdowns between cores (if multiple threads on different cores are fighting over the same memory at the same time), and so on. But in general I highly doubt the use case you presented will adversely impact performance at all (and it could provide benefits, depending), despite you having already preemptively labeled the concept "really terrible code" without even knowing how much time it takes to launch a thread.
Whether it's a good idea or not depends a lot on the details of your situation. What else is the calling thread responsible for? What precisely is involved in preparing and writing out the packets? How frequently are they written out (with what sort of distribution? uniform, clustered, etc...?) and what's their structure like? How many cores does the system have? Etc. Depending on the details, the optimal solution could be anywhere from "no threads at all" to "shared thread pool" to "thread for each packet".
Note that thread pools aren't magic and can in some cases be a slowdown versus unique threads. One of the biggest costs with threads is synchronizing cached memory used by multiple threads at the same time, and a thread pool, by its very nature of having to look for and process work handed over from a different thread, has to do this. So either your primary thread or the worker thread can get stuck waiting when the processor isn't sure whether the other thread has altered a section of memory. By contrast, in an ideal situation, a unique processing thread for a given task only has to share memory with its calling thread once (when it's launched) and then they never interfere with each other again.
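To make that last point concrete, here is a minimal sketch of the "one thread per message" shape being discussed; package and write_serial are hypothetical stand-ins for whatever framing and port I/O the real code does. The payload is copied into the closure once, at thread creation, and the worker never touches the caller's memory again:

    #include <chrono>
    #include <cstdio>
    #include <string>
    #include <thread>
    #include <vector>

    // Hypothetical stand-ins for the real framing and serial-port write.
    std::vector<char> package(const std::string& payload) {
        return std::vector<char>(payload.begin(), payload.end());
    }
    void write_serial(const std::vector<char>& frame) {
        std::fwrite(frame.data(), 1, frame.size(), stdout);
    }

    // One detached thread per message: the payload is handed over at launch,
    // after which the caller and the worker share no memory.
    void send_async(std::string payload) {
        std::thread([payload]() {
            write_serial(package(payload));
        }).detach();
    }

    int main() {
        send_async("hello\n");
        // Crude: give the detached worker time to finish before main returns.
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        return 0;
    }

Whether this beats a shared pool depends on exactly the details listed above; the point is only that the memory handoff happens once per message.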
I have always been told that thread creation is cheap, especially when compared to the alternative of creating a process. If the program you are talking about does not have a lot of operations that need to run concurrently then threading might not be necessary, and judging by what you wrote this might well be the case. Some literature to back me up:
http://www.personal.kent.edu/~rmuhamma/OpSystems/Myos/threads.htm
Threads are cheap in the sense that
They only need a stack and storage for registers; therefore, threads are cheap to create.
Threads use very few resources of the operating system on which they are running. That is, threads do not need a new address space, global data, program code, or operating system resources.
Context switching is fast when working with threads. The reason is that we only have to save and/or restore the PC, SP, and registers.
More of the same here.
In Operating System Concepts 8th Edition (page 155) the authors write about the benefits of threading:
Allocating memory and resources for process creation is costly. Because threads share the resource of the process to which they belong, it is more economical to create and context-switch threads. Empirically gauging the difference in overhead can be difficult, but in general it is much more time consuming to create and manage processes than threads. In Solaris, for example, creating a process is about thirty times slower than is creating a thread, and context switching is about five times slower.
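If you want to see that gap on Linux rather than Solaris, a rough sketch along these lines compares an empty fork/wait cycle against an empty thread create/join cycle (POSIX-only, compile with -pthread; the exact ratio will vary widely by system):

    #include <chrono>
    #include <cstdio>
    #include <sys/wait.h>
    #include <thread>
    #include <unistd.h>

    int main() {
        constexpr int kIterations = 1000;

        // Process creation: fork a child that exits immediately, then reap it.
        auto t0 = std::chrono::steady_clock::now();
        for (int i = 0; i < kIterations; ++i) {
            pid_t pid = fork();
            if (pid == 0) _exit(0);                  // child: exit immediately
            if (pid > 0) waitpid(pid, nullptr, 0);   // parent: reap the child
        }
        auto t1 = std::chrono::steady_clock::now();

        // Thread creation: launch an empty thread and join it.
        for (int i = 0; i < kIterations; ++i) {
            std::thread([] {}).join();
        }
        auto t2 = std::chrono::steady_clock::now();

        using us = std::chrono::microseconds;
        std::printf("fork+wait:   %.1f us each\n",
                    std::chrono::duration_cast<us>(t1 - t0).count() / double(kIterations));
        std::printf("thread+join: %.1f us each\n",
                    std::chrono::duration_cast<us>(t2 - t1).count() / double(kIterations));
        return 0;
    }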