I am learninig spring webflux and I've read the following series of articles(first, second, third)
In the third Article I faced the following text:
Remember the same application code runs on Tomcat, Jetty or Netty. Currently, the Tomcat and Jetty support is provided on top of Servlet 3.1 asynchronous processing, so it is limited to one request per thread. When the same code runs on the Netty server platform that constraint is lifted, and the server can dispatch requests sympathetically to the web client. As long as the client doesn’t block, everyone is happy. Performance metrics for the netty server and client probably show similar characteristics, but the Netty server is not restricted to processing a single request per thread, so it doesn’t use a large thread pool and we might expect to see some differences in resource utilization. We will come back to that later in another article in this series.
First of all I don't see newer article in the series although it was written in 2016. It is clear for me that tomcat has 100 threads by default for handling requests and one thread handle one request in the same time but I don't understand phrase it is limited to one request per thread What does it mean?
Also I would like to know how Netty works for that concrete case(I want to understand difference with Tomcat). Can it handle 2 requests per thread?
Currently there are 2 basic concepts to handle parallel access to a web-server with various advantages and disadvantages:
The first concept of blocking, multi-threaded server has a finite set amount of threads in a pool. Every request will get assigned to specific thread and this thread will be assigned until the request has been fully served. This is basically the same as how a the checkout queues in a super market works, a customer at a time with possible parallel lines. In most circumstances a request in a web server will be cpu-idle for the majority of the time while processing the request. This is due the fact that it has to wait for I/O: read the socket, write to the db (which is also basically IO) and read the result and write to the socket. Additionally using/creating a bunch of threads is slow (context switching) and requires a lot of memory. Therefore this concept often does not use the hardware resources it has very efficiently and has a hard limit on how many clients can be served in parallel. This property is misused in so called starvation attacks, e.g. the slow loris, an attack where usually a single client can DOS a big multi-threaded web-server with little effort.
Most "conventional" web server work that way, e.g. older tomcat, Apache Webserver, and everything Servlet
older than 3 or 3.1 etc.
In contrast a non-blocking web-server can serve multiple clients with only a single thread. That is because it uses the non-blocking kernel I/O features. These are just kernel calls which immediately return and call back when something can be written or read, making the cpu free to do other work instead. Reusing our supermarket metaphor, this would be like, when a cashier needs his supervisor to solve a problem, he does not wait and block the whole lane, but starts to check out the next customer until the supervisor arrives and solves the problem of the first customer.
This is often done in an event loop or higher abstractions as green-threads or fibers. In essence such servers can't really process anything concurrently (of course you can have multiple non-blocking threads), but they are able to serve thousands of clients in parallel because the memory consumption will not scale as drastically as with the multi-thread concept (read: there is no hard limit on max parallel clients). Also there is no thread context-switching. The downside is, that non-blocking code is often more complex to read and write (e.g. callback-hell) and doesn't prefrom well in situations where a request does a lot of cpu-expensive work.
Most modern "fast" web-servers and framework facilitate non-blocking concepts: Netty, Vert.x, Webflux, nginx, servlet 3.1+, Node, Go Webservers.
As a side note, looking at this benchmark page you will see that most of the fastest web-servers are usually non-blocking ones: https://www.techempower.com/benchmarks/
When using Servlet 2.5, Servlet containers will assign a request to a thread until that request has been fully processed.
When using Servlet 3.0 async processing, the server can dispatch the request processing in a separate thread pool while the request is being processed by the application. However, when it comes to I/O, work always happens on a server thread and it is always blocking. This means that a "slow client" can monopolize a server thread, since the server is blocked while reading/writing to that client with a poor network connection.
With Servlet 3.1, async I/O is allowed and in that case the "one request/thread" model isn't anymore. At any point a bit request processing can be scheduled on a different thread managed by the server.
Servlet 3.1+ containers offer all those possibilities with the Servlet API. It's up to the application to leverage async processing, or non-blocking I/O. In the case of non-blocking I/O, the paradigm change is important and it's really challenging to use.
With Spring WebFlux - Tomcat, Jetty and Netty don't have the exact same runtime model, but they all support reactive backpressure and non-blocking I/O.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With