I am troubled by the following concept: most books/docs describe how robust servers are multithreaded, and say that the most common approach is to start a new thread to serve each new client, i.e. a thread is dedicated to each new connection. But how is this actually implemented in big systems? If a server accepts requests from 100,000 clients, has it started 100,000 threads? Is that realistic? Aren't there limits on how many threads can run on a server? And doesn't the overhead of context switching and synchronization degrade performance? Is it implemented as a mix of queues and threads? If so, is the number of queues fixed?

Can anybody enlighten me on this, and perhaps point me to a good reference that describes these designs?
Thanks!
The common method is to use thread pools. A thread pool is a collection of already-created threads. When a new request reaches the server, it is assigned a spare thread from the pool, and when the request has been handled, the thread is returned to the pool. Because the threads are reused, the server never pays the cost of creating and destroying a thread per connection, and the pool size puts an upper bound on how many threads exist at once.
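As a rough sketch of the idea (in Java, using the standard `ExecutorService`; the port 8080 and the pool size of 50 are arbitrary values for illustration, not recommendations):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PooledServer {
    public static void main(String[] args) throws IOException {
        // A fixed pool of worker threads, created once up front.
        ExecutorService pool = Executors.newFixedThreadPool(50);

        try (ServerSocket listener = new ServerSocket(8080)) {
            while (true) {
                Socket client = listener.accept();
                // Hand the connection to a spare pool thread. If all 50
                // threads are busy, the task waits in the pool's internal
                // queue until one becomes free.
                pool.execute(() -> handle(client));
            }
        }
    }

    private static void handle(Socket client) {
        try (client) {
            // ... read the request and write a response ...
        } catch (IOException ignored) {
        }
    }
}
```

Note that this answers your "mix of queues and threads" question directly: when the pool is saturated, new connections simply queue up instead of spawning new threads.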
The number of threads in the pool is configured according to the characteristics of the application. For example, if you have a CPU-bound application, you do not want too many threads, since the extra context switches will decrease performance. On the other hand, if you have a DB- or IO-bound application, you want more threads, since they spend most of their time blocked waiting; with more of them, the CPU stays better utilized.
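To make that concrete, a common sizing heuristic is roughly one thread per core for CPU-bound work, and cores × (1 + wait time / compute time) for IO-bound work (a rule of thumb popularized by Goetz et al., *Java Concurrency in Practice*). A small sketch, where the 9:1 wait-to-compute ratio is an invented example value:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();

        // CPU-bound work: roughly one thread per core keeps every core
        // busy without paying for needless context switches.
        ExecutorService cpuPool = Executors.newFixedThreadPool(cores);

        // IO-bound work: threads mostly sit blocked on IO, so size the
        // pool as cores * (1 + wait/compute). With a 9:1 ratio, an
        // 8-core machine gets 80 threads.
        double waitToCompute = 9.0; // hypothetical ratio for illustration
        int ioThreads = (int) (cores * (1 + waitToCompute));
        ExecutorService ioPool = Executors.newFixedThreadPool(ioThreads);

        System.out.printf("cores=%d, ioThreads=%d%n", cores, ioThreads);
        cpuPool.shutdown();
        ioPool.shutdown();
    }
}
```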
Google "thread pool" and you will find plenty to read about the concept.