When launching a Python grpc.server, what's the difference between maximum_concurrent_rpcs and the max_workers used in the thread pool? If I want maximum_concurrent_rpcs=1, should I still provide more than one thread to the thread pool? In other words, should I match maximum_concurrent_rpcs to my max_workers, or should I provide more workers than max concurrent RPCs?
server = grpc.server(
thread_pool=futures.ThreadPoolExecutor(max_workers=1),
maximum_concurrent_rpcs=1,
)
If your server is already processing maximum_concurrent_rpcs requests concurrently and yet another request is received, the request will be rejected immediately.

If the ThreadPoolExecutor's max_workers is less than maximum_concurrent_rpcs, then once all the threads are busy processing requests, the next request will be queued and will be processed when a thread finishes its processing.
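The queueing half of this can be demonstrated without gRPC at all, since it is just ThreadPoolExecutor behavior. Here is a minimal stdlib-only sketch (the helper name start_times is mine, not from gRPC) showing that with max_workers=1 a second task waits for the first to finish:

```python
import time
from concurrent import futures

def start_times(max_workers, n_tasks, task_time=0.2):
    """Submit n_tasks sleeping tasks; return each task's start delay."""
    starts = {}

    def task(i):
        # Record how long after submission this task actually started.
        starts[i] = time.monotonic() - t0
        time.sleep(task_time)

    pool = futures.ThreadPoolExecutor(max_workers=max_workers)
    t0 = time.monotonic()
    futures.wait([pool.submit(task, i) for i in range(n_tasks)])
    pool.shutdown()
    return starts

# With a single worker, task 1 cannot start until task 0 finishes,
# just as a queued RPC waits for a free thread.
delays = start_times(max_workers=1, n_tasks=2)
print(f"task 0 started after {delays[0]:.2f}s, task 1 after {delays[1]:.2f}s")
```

With max_workers=2 instead, both delays come out near zero, which is the "plenty of threads available" case described below.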
I had the same question. To answer it, I debugged a bit what happens with maximum_concurrent_rpcs. The debugging led me to py36/lib/python3.6/site-packages/grpc/_server.py in my virtualenv. Search for concurrency_exceeded. The bottom line is that if the server is already processing maximum_concurrent_rpcs requests and another request arrives, it will be rejected:
# ...
elif concurrency_exceeded:
    return _reject_rpc(rpc_event, cygrpc.StatusCode.resource_exhausted,
                       b'Concurrent RPC limit exceeded!'), None
# ...
I tried it with the gRPC Python Quickstart example.

In greeter_server.py I modified the SayHello() method:
# ...
def SayHello(self, request, context):
    print("Request arrived, sleeping a bit...")
    time.sleep(10)
    return helloworld_pb2.HelloReply(message='Hello, %s!' % request.name)
# ...
and the serve() method:
def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10), maximum_concurrent_rpcs=2)
    # ...
Then I opened 3 terminals and executed the client in them manually (as fast as I could) using python greeter_client.py:
As expected, for the first 2 clients, processing of the request started immediately (as can be seen in the server's output), because there were plenty of threads available. But the 3rd client got rejected immediately (as expected) with StatusCode.RESOURCE_EXHAUSTED and the message Concurrent RPC limit exceeded!
Now, to test what happens when there are not enough threads given to the ThreadPoolExecutor, I modified max_workers to be 1:
server = grpc.server(futures.ThreadPoolExecutor(max_workers=1), maximum_concurrent_rpcs=2)
I ran my 3 clients again at roughly the same time as previously.

The result was that the first one got served immediately. The second one needed to wait 10 seconds (while the first one was being served) and then it was served. The third one got rejected immediately.
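The combined behavior observed above can be modeled with a small toy sketch. This is my own simplified model, not gRPC internals: a semaphore plays the role of maximum_concurrent_rpcs (admitting or rejecting a call at arrival time), while the thread pool's max_workers decides when an admitted call actually runs. The MiniServer name and its submit/reject API are illustrative only.

```python
import threading
import time
from concurrent import futures

class MiniServer:
    """Toy model of maximum_concurrent_rpcs vs max_workers (not real gRPC):
    a call is admitted only if a concurrency slot is free; admitted calls
    then wait for a free pool thread."""

    def __init__(self, max_workers, maximum_concurrent_rpcs):
        self._pool = futures.ThreadPoolExecutor(max_workers=max_workers)
        self._slots = threading.BoundedSemaphore(maximum_concurrent_rpcs)

    def submit(self, handler):
        # Admission check happens immediately, like the concurrency_exceeded
        # check in _server.py.
        if not self._slots.acquire(blocking=False):
            return None  # analogous to a RESOURCE_EXHAUSTED rejection

        def run():
            try:
                return handler()
            finally:
                self._slots.release()  # free the slot when the call finishes

        return self._pool.submit(run)

server = MiniServer(max_workers=1, maximum_concurrent_rpcs=2)
f1 = server.submit(lambda: time.sleep(0.2) or "a")  # runs immediately
f2 = server.submit(lambda: "b")  # admitted, but queued behind f1
f3 = server.submit(lambda: "c")  # rejected: 2 RPCs already in flight
print(f3)  # None
print(f1.result(), f2.result())
```

This reproduces the experiment: with max_workers=1 and maximum_concurrent_rpcs=2, the first call runs, the second is admitted but queued, and the third is rejected at arrival.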