I came across asynchronous processing of requests by Servlets while exploring how a Node.js application and a Java application handle a request.
From what I have read in different places:
The request is received and processed by an HTTP thread from the Servlet container. In the case of blocking operations (like I/O), the request can be handed over to another thread pool, and the HTTP thread that received the request can go back to receiving and processing the next request.
The time-consuming blocking operation is then taken up by a worker thread from that pool.
If my understanding is correct, I have the following question:
Even the thread that processes the blocking operation is going to wait for that operation to complete, and is therefore tying up resources (and the number of threads running in parallel is limited by the number of cores), if I am right.
What exactly is the gain of using asynchronous processing here?
If not, please enlighten me.
I can explain the benefits in terms of Node.js (the same applies elsewhere).
The problem: blocking network I/O.
Suppose a client opens a connection to your server. To read from that connection, you need a thread T1 which reads data over the network for that connection. This read is blocking, i.e. the thread waits indefinitely until there is any data to read. Now suppose another connection arrives around the same time; to handle it you have to create another thread, T2. It's quite possible that this thread again blocks while reading data on the second connection, so you can handle only as many connections as you can handle threads in your system. This is called the thread-per-request model. Creating lots of threads degrades system performance due to heavy context switching and scheduling. This model doesn't scale well.
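The thread-per-request model described above can be sketched in Java with a plain blocking ServerSocket. This is a minimal demo (the class name and the trivial echo protocol are my own illustration):

```java
import java.io.*;
import java.net.*;

public class ThreadPerRequestServer {
    public static void main(String[] args) throws IOException {
        ServerSocket server = new ServerSocket(0); // ephemeral port for the demo
        Thread acceptor = new Thread(() -> {
            try {
                while (true) {
                    Socket conn = server.accept();          // blocks until a client connects
                    new Thread(() -> handle(conn)).start(); // one thread per connection
                }
            } catch (IOException e) { /* server closed */ }
        });
        acceptor.setDaemon(true);
        acceptor.start();

        // Demo client: each connection ties up one server thread while it blocks on read.
        try (Socket client = new Socket("localhost", server.getLocalPort());
             PrintWriter out = new PrintWriter(client.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()))) {
            out.println("ping");
            System.out.println(in.readLine()); // prints "echo: ping"
        }
        server.close();
    }

    static void handle(Socket conn) {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
             PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
            String line = in.readLine(); // blocks until data arrives -- the core problem
            out.println("echo: " + line);
        } catch (IOException e) { /* connection closed */ }
    }
}
```

Every accepted connection consumes an entire thread that sits blocked in readLine() until data arrives; with thousands of mostly idle connections, you end up with thousands of mostly sleeping threads.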
Solution:
A little background: FreeBSD provides a system call named kqueue and Linux provides epoll. Both accept a list of socket file descriptors (as parameters); the calling thread blocks until one or more of those sockets have data ready to read, and the call returns the sublist of ready connections. Ref. http://austingwalters.com/io-multiplexing/
Now, assuming you have a feel for the above calls, imagine there is a thread called the Event Loop which invokes epoll/kqueue.
So in Java your code would look something like this:
/* Called by the Event Loop thread */
while (true) {
    /*
     * socketFd: the socket your server is listening on.
     * epoll blocks until one or more connections are ready,
     * then returns just those ready connections.
     */
    List<Connection> readyConnections = epoll(socketFd);
    /*
     * Worker threads read the data from these connections.
     * This is very fast, since the data is already ready to
     * be read, so the workers don't need to wait.
     */
    submitToWorkerThreads(readyConnections);
    /*
     * Worker threads queue callback methods together with the
     * data they read; the event loop thread then executes those
     * callbacks. This is where the main bottleneck is in Node.js:
     * if your callback has a time-consuming task, say a loop of
     * 1M iterations, the event loop is busy and you can't accept
     * new connections. In practice, in-memory computation is very
     * fast compared to network I/O.
     */
    executeCallbackMethodsFromQueue();
}
So now you can see that the above approach accepts many more connections than the thread-per-request model, and the worker threads are not stuck either, since they read only from connections that already have data. When a worker thread has read the whole payload, it queues its response (or data) together with the callback handler you provided at the time of listening; this callback is then executed by the event loop thread.
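On the JVM, this event-loop pattern is available through java.nio's Selector, which is backed by epoll/kqueue on Linux/FreeBSD. Here is a minimal runnable sketch of a single-threaded event loop that echoes data back; the class name and the embedded demo client are my own illustration, not part of the answer above:

```java
import java.io.*;
import java.net.*;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.nio.charset.StandardCharsets;

public class EventLoopDemo {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();                 // epoll/kqueue under the hood
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("localhost", 0));  // ephemeral port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        Thread loop = new Thread(() -> eventLoop(selector));
        loop.setDaemon(true);
        loop.start();

        // Demo client: send a line, read the echo.
        try (Socket client = new Socket("localhost", port)) {
            client.getOutputStream().write("ping\n".getBytes(StandardCharsets.UTF_8));
            BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
            System.out.println(in.readLine());
        }
    }

    // One thread multiplexes all connections instead of one thread per connection.
    static void eventLoop(Selector selector) {
        try {
            while (true) {
                selector.select();                           // blocks until something is ready
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isAcceptable()) {
                        SocketChannel conn = ((ServerSocketChannel) key.channel()).accept();
                        if (conn != null) {
                            conn.configureBlocking(false);
                            conn.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel conn = (SocketChannel) key.channel();
                        ByteBuffer buf = ByteBuffer.allocate(1024);
                        int n = conn.read(buf);              // returns immediately: data is ready
                        if (n > 0) { buf.flip(); conn.write(buf); }
                        else if (n == -1) conn.close();
                    }
                }
                selector.selectedKeys().clear();
            }
        } catch (IOException e) { /* selector closed, loop ends */ }
    }
}
```

Note that the reads in the loop never wait for data: select() already told us which channels are ready, which is exactly the epoll(socketFd) step in the pseudocode.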
The above approach has two disadvantages: a single process uses only one CPU core, and a CPU-heavy callback blocks the event loop for every connection (the bottleneck noted in the code comment above).
The first disadvantage can be taken care of by clustered Node.js, i.e. running one Node.js process per CPU core.
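The other mitigation, hinted at by the worker-thread step in the pseudocode, is to keep CPU-heavy work off the event loop entirely: run it on a worker pool and hand the result back to the loop via a callback queue. A minimal Java sketch of that hand-off (the class name and queue layout are illustrative assumptions):

```java
import java.util.concurrent.*;

public class OffloadDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());   // one worker per core
        BlockingQueue<Runnable> callbackQueue = new LinkedBlockingQueue<>();

        // The heavy computation runs on a worker, not on the event loop.
        workers.submit(() -> {
            long sum = 0;
            for (int i = 1; i <= 1_000_000; i++) sum += i;     // the "loop of 1M" from above
            long result = sum;
            callbackQueue.add(() -> System.out.println("sum=" + result)); // queue the callback
        });

        // The "event loop" stays free for new connections; here it just drains one callback.
        callbackQueue.take().run();
        workers.shutdown();
    }
}
```

The event loop only ever executes the cheap callback at the end; the million-iteration loop never runs on it, so new connections can still be accepted while the computation is in flight.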
Anyway, have a look at Vert.x, which is similar to Node.js but runs on the JVM. Also explore Netty.