asynchronous http request handling with tomcat and spring

Question

It's my first SO question so be patient with me :)

I'm trying to create a service that:

Receives HTTP GET requests containing a URL to query
For a single GET request the service extracts the URL
Queries a local DB about the URL
If a result was found in the DB it will return it to the client and if not it will need to query some external services (that may take relatively long time to respond)
Return the result of the URL to the client

I'm running this on a virtual machine and Tomcat7 with spring. I'll apologize in advance and mention that I'm pretty new to Tomcat

Anyway, I'm expecting a lot of concurrent GET requests to this service (hundreds of thousands of simultaneous requests) What I'm basically trying to achieve is to make this service as scalable as possible (and if that's not possible then at least a service that can handle hundreds of thousands of simultaneous requests)

I've been reading A LOT about asynchronous requests handling in services and especially in Tomcat but I have some things that are still unclear to me:

From the official tomcat website it seems that Tomcat contains number of acceptor threads and number of working threads. If so, why should I use AsyncContext? Whats the benefit of releasing a Tomcat's working thread and occupying a different thread in my application to do the exact same actions? (there's still 1 active thread in the system)
Somewhat similar to the first question but are there any benefits for creating the AsyncContext and using it with a different thread? (a thread from a thread pool created in my application)
Regarding the same issue, I've seen here that I can also return a Callable or a DeferredResult and process it with either one of Tomcat's threads or with one of my own threads. Are there any benefits for returning a Callable or using a DeferredResult over just processing the AsyncContext from the requests?
Also, If I decide to return a callable, from what thread pool does Tomcat gets the thread to process my callable? Are the threads being used here the same working threads from Tomcat that I previously mentioned? If so, what benefits do I get from releasing one Tomcat working thread and using a different one instead?
I've seen from Oracle's documentation that I can pass AsyncContext a Runnable object that will be processed concurrently, From where do the threads used to execute this Runnable come from? Do I have any control over it? Also, any benefits to passing the AsyncContext a Runnable over just passing the AsyncContext to one my threads?

I apologize for asking so many questions regarding the same things but me and my colleagues are arguing over these things for over a week without any concrete answer.

I have 1 more general question: What do you think is the best way to make the service I described scalable? (putting aside adding more machines at the moment), could you post any examples or references for the purposed solution?

I'd post more links of links I've been looking at but my current reputation doesn't allow it. I'll be grateful for any understandable references or for concrete examples and I'll obviously be happy to clarify on any relevant issue

Cheers!

Tim Boudreau · Accepted Answer

There are a lot of questions packed into this, but I'll try to address some of them.

Asynchronous I/O is a good thing, especially on servers that serve large volumes of requests - it allows you to use fewer threads to process more requests. In the case of a proxy such as you are writing, you really want your HTTP client (that makes the requests to foreign URLs) to be asynchronous as well, so that neither processing the request nor receiving the remote response involves blocking I/O.

That said, you may have a harder time doing this stuff with Tomcat or Java EE servers in general, which have had asynchronous I/O bolted onto them as an afterthought, than using a framework like Netty that is asynchronous from the ground up. As the author of a framework which builds on top of Netty, I'm a bit biased.

To demonstrate how little code you'd need to do what you describe, I wrote a small server that does what you describe here in 3 Java source files and put it on github - it builds a standalone JAR you can run with java -jar to try it out, and I tried to comment it clearly.

What it comes down to is, networked applications spend most of their time waiting for I/O to happen. In the case of a proxy in particular, with traditional, threaded I/O, you would get a request, and the thread that received the request would be responsible for answering it synchronously - that means, if it has to make a network request to another server, that thread is blocked waiting for the answer to come from the remote server. Meaning that thread can't be used for anything else. So, if you have 10 threads, and all of them are waiting on responses, your server can't answer any more requests until one of them finishes and frees up a thread. With asynchronous I/O, you get a callback when some I/O completes. In other words, instead of standing still until the OS flushes your data to the socket and out the network card, your code simply gets a friendly tap on the shoulder when there is something to do (like a response arriving from your proxy request). While your code is waiting for that HTTP request to complete, the thread that sent the proxy request is free to be used to handle another request That means one thread can do a little work on one request, do a little work on another, and another, and eventually finish the first request. Since threads are a finite resource provided by your operating system, this allows you to do a lot more with a lot less hardware.

As to Callable vs. DeferredResult, using a Callable just moves when the work happens around (the Callable gets executed later, on some thread or other, but is still expected to do return a result synchronously); DeferredResult sounds more like what you'd need, since that allows your code to go off and do whatever work it wants, and then set the result (triggering completion of the response) whenever it has something to set.

Honestly, I think if you want to implement this really efficiently, you'd be better off staying away from the Java EE stack - so much of it has baked in assumptions that I/O is synchronous that trying to do async stuff with it is swimming upstream (for example, JDBC has synchronous I/O baked into its bones - if you really want this to scale and you want to use an SQL database, you'd be better off with something like this ).

For another example of using Netty for this sort of thing, see the tiny-maven-proxy project - the code is less pretty, but it shows an example of doing an HTTP proxy where the response body is fed to the client chunk-by-chunk, as it arrives - so you never actually pull the full response body into memory, meaning even requests with huge responses won't run the proxy out of memory. Tiny-maven-proxy also caches on the filesystem. I didn't do those things in the demo because it would have made the code more complicated.

asynchronous http request handling with tomcat and spring

Tags:

java

spring-mvc

multithreading

tomcat

asynccontroller

Gideon

Video Answer

1 Answers

Tim Boudreau

Recent Activity

Donate For Us

asynchronous http request handling with tomcat and spring

Tags:

java

spring-mvc

multithreading

tomcat

asynccontroller

Gideon

Video Answer

1 Answers

Tim Boudreau

Related questions

Recent Activity

Donate For Us