
Multithread GAE servlets to handle concurrent users

I'd like to multithread my GAE servlets so that the same servlet on the same instance can handle up to 10 concurrent requests from different users at the same time, timeslicing between them (I believe 10 is the maximum number of threads on a frontend instance).

public class MyServlet extends HttpServlet {
    private ExecutorService executor;

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) {
        if(executor == null) {
            ThreadFactory threadFactory = ThreadManager.currentRequestThreadFactory();
            executor = Executors.newCachedThreadPool(threadFactory);
        }

        MyResult result = executor.submit(new MyTask(request));

        writeResponseAndReturn(response, result);
    }
}

So basically when GAE starts up, the first time it gets a request to this servlet, an Executor is created and then saved. Then each new servlet request uses that executor to spawn a new thread. Obviously everything inside MyTask must be thread-safe.

What I'm concerned about is whether or not this truly does what I'm hoping it does. That is, does this code create a non-blocking servlet that can handle multiple requests from multiple users at the same time? If not, why and what do I need to do to fix it? And, in general, is there anything else that a GAE maestro can spot that is dead wrong? Thanks in advance.

IAmYourFaja asked Feb 13 '13 22:02


2 Answers

I don't think your code would work.

The doGet method runs in threads managed by the servlet container. When a request comes in, a servlet thread is occupied, and it will not be released until the doGet method returns. In your code, executor.submit would return a Future object. To get the actual result you need to invoke the get method on the Future object, and that call blocks until MyTask finishes its work. Only after that does doGet return, so that new requests can kick in.
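To make the blocking concrete, here is a minimal sketch that uses only java.util.concurrent and no App Engine classes. SlowTask and handle are hypothetical stand-ins for MyTask and the body of doGet; the 200 ms sleep is an arbitrary placeholder for real work.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingGetDemo {

    /** Hypothetical stand-in for MyTask: sleeps, then reports which thread ran it. */
    static class SlowTask implements Callable<String> {
        @Override
        public String call() throws Exception {
            Thread.sleep(200); // simulate work
            return Thread.currentThread().getName();
        }
    }

    /**
     * Mimics the body of doGet(): hands the work to a pool thread, then has to
     * wait for it. Returns how many milliseconds the caller was blocked.
     */
    static long handle(ExecutorService executor) throws Exception {
        long start = System.nanoTime();
        Future<String> future = executor.submit(new SlowTask()); // returns at once
        future.get(); // blocks the calling thread until SlowTask finishes
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            System.out.println("caller blocked for ~" + handle(executor) + " ms");
        } finally {
            executor.shutdown();
        }
    }
}
```

The calling thread is held for the full duration of the task, so handing the work to a second thread does not free the first one any sooner.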

I am not familiar with GAE, but according to their docs, you can declare your servlet as thread-safe and then the container will dispatch multiple requests to each web server in parallel:

<!-- in appengine-web.xml -->
<threadsafe>true</threadsafe>

ericson answered Nov 03 '22 10:11


You implicitly asked two questions, so let me answer both:

1. How can I get my AppEngine Instance to handle multiple concurrent requests?

You really only need to do two things:

  1. Add the statement <threadsafe>true</threadsafe> to your appengine-web.xml file, which you can find in the war\WEB-INF folder.
  2. Make sure that the code inside all your request handlers is actually thread-safe, i.e. use only local variables in your doGet(...), doPost(...), etc. methods or make sure you synchronize all access to class or global variables.

This will tell the AppEngine instance server framework that your code is thread-safe and that you are allowing it to call all of your request handlers multiple times in different threads to handle several requests at the same time. Note: AFAIK, it is not possible to set this on a per-servlet basis. So, ALL your servlets need to be thread-safe!
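As a rough illustration of point 2, the sketch below calls one shared handler object from a pool of 10 threads, the way a container would call a single servlet instance. It uses plain Java only; Handler, handle and simulate are made-up names, and the fixed thread pool is just a stand-in for the container, not how AppEngine is actually implemented.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadSafeHandlerDemo {

    /** Hypothetical handler; handle() plays the role of doGet(). */
    static class Handler {
        // Shared across all requests, so it must be safe for concurrent access.
        private final AtomicInteger hits = new AtomicInteger();

        String handle(String user) {
            // Local variables live on the calling thread's stack and need no locking.
            String greeting = "Hello, " + user;
            int n = hits.incrementAndGet(); // atomic read-modify-write
            return greeting + " (request #" + n + ")";
        }

        int hits() {
            return hits.get();
        }
    }

    /** Calls one shared Handler from a pool of `threads` threads, `requests` times in total. */
    static int simulate(int threads, int requests) throws InterruptedException {
        final Handler handler = new Handler();
        ExecutorService container = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            final String user = "user" + i;
            container.execute(new Runnable() {
                @Override
                public void run() {
                    handler.handle(user);
                }
            });
        }
        container.shutdown();
        if (!container.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("simulated requests did not finish in time");
        }
        return handler.hits();
    }

    public static void main(String[] args) throws InterruptedException {
        // 10 threads mirrors the per-instance limit mentioned in the question.
        System.out.println(simulate(10, 1000)); // prints 1000
    }
}
```

With a plain int field and hits++ instead of the AtomicInteger, the final count could come out lower than 1000, which is exactly the kind of bug the threadsafe flag exposes.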

So, in essence, the executor-code you posted is already included in the server code of each AppEngine instance, and actually calls your doGet(...) method from inside the run method of a separate thread that AppEngine creates (or reuses) for each request. Basically doGet() already is your MyTask().

The relevant part of the Docs is here (although it doesn't really say much): https://developers.google.com/appengine/docs/java/config/appconfig#Using_Concurrent_Requests

2. Is the posted code useful for this (or any other) purpose?

AppEngine in its current form does not allow you to create and use your own threads to accept requests. It only allows you to create threads inside your doGet(...) handler, using the currentRequestThreadFactory() method you mentioned, but only to do parallel processing for this one request and not to accept a second one in parallel (this happens outside doGet()).

The name currentRequestThreadFactory() might be a little misleading here. It does not mean that it will return the current Factory of RequestThreads, i.e. threads that handle requests. It means that it returns a Factory that can create Threads inside the currentRequest. So, unfortunately it is actually not even allowed to use the returned ThreadFactory beyond the scope of the current doGet() execution, like you are suggesting by creating an Executor based on it and keeping it around in a class variable.

For frontend instances, any threads you create inside a doGet() call will get terminated immediately when your doGet() method returns. For backend instances, you are allowed to create threads that keep running, but since you are not allowed to open server sockets for accepting requests inside these threads, these will still not allow you to manage the request handling yourself.
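For what it's worth, this is roughly what a legitimate use of request-scoped threads looks like: fan out some work for one request, wait for it, and shut the pool down before returning. The sketch is plain Java so it runs anywhere; sumOfSquares is a made-up example, and Executors.defaultThreadFactory() stands in for ThreadManager.currentRequestThreadFactory(), which you would pass in on AppEngine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

public class RequestScopedPoolDemo {

    /** Sums the squares of the inputs, computing each square on a pool thread. */
    static int sumOfSquares(List<Integer> inputs, ThreadFactory factory) throws Exception {
        // Created per call and never stored in a field.
        ExecutorService executor = Executors.newCachedThreadPool(factory);
        try {
            List<Callable<Integer>> tasks = new ArrayList<Callable<Integer>>();
            for (final Integer x : inputs) {
                tasks.add(new Callable<Integer>() {
                    @Override
                    public Integer call() {
                        return x * x;
                    }
                });
            }
            int total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Integer> f : executor.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } finally {
            executor.shutdown(); // nothing outlives the "request"
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in factory (assumption): on AppEngine this would be the
        // request-scoped factory instead.
        List<Integer> inputs = Arrays.asList(1, 2, 3, 4);
        System.out.println(sumOfSquares(inputs, Executors.defaultThreadFactory())); // prints 30
    }
}
```

The executor is a local variable on purpose: nothing created from the factory is kept around after the method returns.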

You can find more details on what you can and cannot do inside an appengine servlet here:

The Java Servlet Environment - The Sandbox (specifically the Threads section)

For completeness, let's see how your code can be made "legal":

The following should work, but it won't make a difference in terms of your code being able to handle multiple requests in parallel. That will be determined solely by the <threadsafe>true</threadsafe> setting in your appengine-web.xml. So, technically, this code is just really inefficient and splits an essentially linear program flow across two threads. But here it is anyway:

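public class MyServlet extends HttpServlet {

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        ThreadFactory threadFactory = ThreadManager.currentRequestThreadFactory();
        // ExecutorService (not plain Executor) is the type that offers submit()
        ExecutorService executor = Executors.newCachedThreadPool(threadFactory);
        try {
            Future<MyResult> result = executor.submit(new MyTask(request)); // Fires off request handling in a separate thread

            writeResponse(response, result.get()); // Waits for thread to complete and builds response. After that, doGet() returns
        } catch (InterruptedException | ExecutionException e) {
            throw new ServletException(e); // Future.get() throws checked exceptions
        } finally {
            executor.shutdown(); // The pool must not outlive this request
        }
    }
}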

Since you are already inside a separate thread that is specific to the request you are currently handling, you should definitely save yourself the "thread inside a thread" and simply do this instead:

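public class MyServlet extends HttpServlet {

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        try {
            // Delegate request handling to MyTask object in current thread and write out returned response
            writeResponse(response, new MyTask(request).call());
        } catch (Exception e) { // Callable.call() declares a checked Exception
            throw new ServletException(e);
        }
    }
}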

Or, even better, just move the code from MyTask.call() into the doGet() method. ;)

Aside - Regarding the limit of 10 simultaneous servlet threads you mentioned:

This is a (temporary) design-decision that allows Google to control the load on their servers more easily (specifically the memory use of servlets).

You can find more discussion on those issues here:

  • Issue 7927: Allow configurable limit of concurrent requests per instance
  • Dynamic Backend Instance Scaling
  • If your bill shoots up due to increased latency, you may not be refunded the charges incurred

This topic has been bugging the heck out of me, too, since I am a strong believer in ultra-lean servlet code, so my usual servlets could easily handle hundreds, if not thousands, of concurrent requests. Having to pay for more instances due to this arbitrary limit of 10 threads per instance is a little annoying to me to say the least. But reading over the links I posted above, it sounds like they are aware of this and are working on a better solution. So, let's see what announcements Google I/O 2013 will bring in May... :)

14 revs answered Nov 03 '22 10:11