
What are the thread limitations when working on Linux compared to processes for network/IO-bound apps?

I've heard that on a multicore Linux server it is impossible to reach top performance with just 1 process and multiple threads, because Linux has some limitations on IO, so that 1 process with 8 threads on an 8-core server might be slower than 8 processes.

Any comments? Are there other limitations which might slow the application down? The application is a network C++ application serving 100s of clients, with some disk IO.

Update: I am concerned that there are more IO-related issues beyond the locking I implement myself. Aren't there any issues with doing simultaneous network/disk IO in several threads?

BarsMonster asked Aug 31 '10 13:08



1 Answer

Drawbacks of Threads

Threads:

  • Serialize on memory operations. That is, the kernel, and in turn the MMU, must service operations such as mmap() that perform page allocations, and within one process these are serialized because all threads share a single address space.
  • Share the same file descriptor table. Changes to and lookups in this table, which stores things like file offsets and other flags, involve locking. Every system call that uses the table, such as open(), accept() or fcntl(), must lock it to translate an fd to the internal file handle, and again when making changes.
  • Share some scheduling attributes. Processes are constantly evaluated to determine the load they put on the system, and are scheduled accordingly. Many threads imply a higher CPU load, which the scheduler typically dislikes, and this increases the response time to events for that process (such as reading incoming data on a socket).
  • May share some writable memory. Any memory written to by multiple threads generates all kinds of cache contention and convoying issues, and is especially slow if it requires fancy locking. For example, heap operations such as malloc() and free() operate on a global data structure (this can be worked around to some degree). There are other global structures as well.
  • Share credentials. This might be an issue for service-type processes.
  • Share signal handling. Signals interrupt the entire process while they are handled.
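
As a rough illustration of the shared file descriptor table, here is a minimal sketch (the function names are made up for this example, and /dev/null is just a convenient file to open). A descriptor opened in one thread is immediately valid in every other thread of the process, whereas a descriptor opened in a forked child never appears in the parent. It only shows the sharing itself, not the cost of the locking behind it.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/wait.h>
#include <thread>
#include <unistd.h>

// Hypothetical helper: open a file in a second thread, then check from the
// calling thread whether that descriptor number is valid.
bool fd_opened_in_thread_is_visible() {
    int fd = -1;
    std::thread t([&fd] { fd = open("/dev/null", O_RDONLY); });
    t.join();
    if (fd < 0) return false;
    // Same fd table, so the descriptor is valid in this thread too.
    bool visible = fcntl(fd, F_GETFD) != -1;
    close(fd);
    return visible;
}

// Hypothetical helper: open a file in a forked child, send the descriptor
// number back through a pipe, then check whether it is valid in the parent.
bool fd_opened_in_child_is_visible() {
    int pipefd[2];
    if (pipe(pipefd) != 0) return false;
    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return false;
    }
    if (pid == 0) {
        // Child: works on its own copy of the fd table from here on.
        close(pipefd[0]);
        int fd = open("/dev/null", O_RDONLY);
        bool sent = write(pipefd[1], &fd, sizeof fd) ==
                    static_cast<ssize_t>(sizeof fd);
        _exit(sent ? 0 : 1);
    }
    close(pipefd[1]);
    int child_fd = -1;
    bool got = read(pipefd[0], &child_fd, sizeof child_fd) ==
               static_cast<ssize_t>(sizeof child_fd);
    close(pipefd[0]);
    waitpid(pid, nullptr, 0);
    // The number is only meaningful in the child; here it is a closed slot.
    return got && child_fd >= 0 && fcntl(child_fd, F_GETFD) != -1;
}
```

Calling fd_opened_in_thread_is_visible() should return true and fd_opened_in_child_is_visible() should return false. Build with thread support enabled (for example -pthread with GCC or Clang).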

Processes or Threads?

  • If you want to make debugging easier, use threads.
  • If you are on Windows, use threads. (Processes are extremely heavyweight in Windows).
  • If stability is a huge concern, try to use processes. (One SIGSEGV/PIPE is all it takes...).
  • If threads aren't available, use processes. (Not so common now, but it did happen).
  • If your threads share resources that can't be used from multiple processes, use threads. (Or provide an IPC mechanism to allow communicating with the "owner" thread of the resource).
  • If you use resources that are only available on a one-per-process basis (and you need one per context), obviously use processes.
  • If your processing contexts share absolutely nothing (such as a socket server that spawns and forgets connections as it accept()s them), and CPU is a bottleneck, use processes and single-threaded runtimes (which avoid the intense locking found on the heap and in other places).
  • One of the biggest differences between threads and processes is this: threads use software constructs to protect data structures, whereas processes use hardware (which is significantly faster).
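
The "share nothing, use processes" case can be sketched roughly as follows. This is an illustration under assumptions, not a server: run_isolated_workers and WorkerResult are hypothetical names, and the loop stands in for real per-connection work. Each forked worker updates its own private copy of counter without any locks, because the separate address spaces keep the copies apart, and the parent's copy is never touched.

```cpp
#include <cassert>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical result type for this sketch.
struct WorkerResult {
    int parent_counter;  // value of the counter in the parent afterwards
    int workers_ok;      // number of workers that exited successfully
};

// Fork n single-threaded workers that each modify their own copy of `counter`.
WorkerResult run_isolated_workers(int n) {
    int counter = 0;
    std::vector<pid_t> pids;
    for (int i = 0; i < n; ++i) {
        pid_t pid = fork();
        if (pid < 0) break;  // could not fork; wait for the ones we have
        if (pid == 0) {
            // Child: private copy of the address space, so no locking needed.
            for (int k = 0; k < 1000; ++k) ++counter;
            _exit(counter == 1000 ? 0 : 1);
        }
        pids.push_back(pid);
    }
    int ok = 0;
    for (pid_t pid : pids) {
        int status = 0;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0) {
            ++ok;
        }
    }
    // The children's writes never reach the parent's copy.
    return {counter, ok};
}
```

With n = 4, the parent should see parent_counter == 0 and workers_ok == 4. Doing the same with four threads incrementing one shared counter would need a mutex or an atomic to stay correct.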

Links

  • pthreads(7)
  • About Processes and Threads (MSDN)
  • Threads vs. Processes
Matt Joiner answered Sep 26 '22 00:09