GRPC: make high-throughput client in Java/Scala

Tags:

java

scala

grpc

I have a service that transfers messages at quite a high rate.

Currently it is served by akka-tcp and handles 3.5M messages per minute. I decided to give grpc a try. Unfortunately, it resulted in much lower throughput: ~500k messages per minute or even less.

Could you please recommend how to optimize it?

My setup

Hardware: 32 cores, 24 GB heap.

grpc version: 1.25.0

Message format and endpoint

The message is basically a binary blob. The client streams 100K to 1M or more messages into the same request (asynchronously), the server doesn't respond with anything, and the client uses a no-op observer.

service MyService {
    rpc send (stream MyMessage) returns (stream DummyResponse);
}

message MyMessage {
    int64 someField = 1;
    bytes payload = 2;  //not huge
}

message DummyResponse {
}

Problems: The message rate is low compared to the akka implementation. I observe low CPU usage, so I suspect that the grpc call is actually blocking internally even though it claims to be asynchronous. Calling onNext() indeed doesn't return immediately, but GC may also be a factor.

I tried spawning more senders to mitigate this issue but didn't get much improvement.

My findings: grpc actually allocates an 8 KB byte buffer for each message when it serializes it. See the stack trace:

java.lang.Thread.State: BLOCKED (on object monitor)
    at com.google.common.io.ByteStreams.createBuffer(ByteStreams.java:58)
    at com.google.common.io.ByteStreams.copy(ByteStreams.java:105)
    at io.grpc.internal.MessageFramer.writeToOutputStream(MessageFramer.java:274)
    at io.grpc.internal.MessageFramer.writeKnownLengthUncompressed(MessageFramer.java:230)
    at io.grpc.internal.MessageFramer.writeUncompressed(MessageFramer.java:168)
    at io.grpc.internal.MessageFramer.writePayload(MessageFramer.java:141)
    at io.grpc.internal.AbstractStream.writeMessage(AbstractStream.java:53)
    at io.grpc.internal.ForwardingClientStream.writeMessage(ForwardingClientStream.java:37)
    at io.grpc.internal.DelayedStream.writeMessage(DelayedStream.java:252)
    at io.grpc.internal.ClientCallImpl.sendMessageInternal(ClientCallImpl.java:473)
    at io.grpc.internal.ClientCallImpl.sendMessage(ClientCallImpl.java:457)
    at io.grpc.ForwardingClientCall.sendMessage(ForwardingClientCall.java:37)
    at io.grpc.ForwardingClientCall.sendMessage(ForwardingClientCall.java:37)
    at io.grpc.stub.ClientCalls$CallToStreamObserverAdapter.onNext(ClientCalls.java:346)
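
For readers unfamiliar with the two top frames, here is a rough JDK-only sketch of what a stream copy with a freshly allocated buffer looks like. It is a paraphrase for illustration, not Guava's actual source. The class name CopySketch is hypothetical, and the 8192-byte size is taken from the 8 KB figure mentioned above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical illustration class; not part of Guava or grpc-java. */
public final class CopySketch {
    // 8 KB, matching the buffer size reported in the question (assumption).
    static final int BUFFER_SIZE = 8192;

    /** Copies all bytes from one stream to another and returns the byte count. */
    static long copy(InputStream from, OutputStream to) throws IOException {
        // A fresh buffer is allocated on every call, regardless of how
        // small the message being copied is.
        byte[] buf = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = from.read(buf)) != -1) {
            to.write(buf, 0, read);
            total += read;
        }
        return total;
    }
}
```

If a copy like this runs once per message, a stream of many small messages would allocate one such buffer per message, which may be part of the GC pressure described above.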

Any help with best practices for building high-throughput grpc clients would be appreciated.

538

asked Nov 08 '19 10:11

simpadjo

1 Answer

I solved the issue by creating several ManagedChannel instances per destination. Although articles say that a single ManagedChannel can spawn enough connections by itself, so one instance should be enough, that wasn't true in my case.

Performance is now on par with the akka-tcp implementation.
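
A minimal sketch of the "several channels per destination" idea, assuming simple round-robin selection is acceptable. RoundRobinPool is a hypothetical helper, not part of grpc-java. It is kept generic so it compiles with the JDK alone; in a real client the type parameter would be io.grpc.ManagedChannel.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of grpc-java): creates a fixed number of
 * channel-like objects up front and hands them out in round-robin order.
 */
public final class RoundRobinPool<T> {
    private final List<T> members;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinPool(int size, Supplier<T> factory) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<T> created = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            created.add(factory.get()); // one independent instance per slot
        }
        this.members = Collections.unmodifiableList(created);
    }

    /** Returns the next member; safe to call from many sender threads. */
    public T next() {
        // floorMod keeps the index non-negative after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), members.size());
        return members.get(index);
    }

    public int size() {
        return members.size();
    }
}
```

With grpc-java, the factory could be something like () -> ManagedChannelBuilder.forAddress(host, port).usePlaintext().build(), and each new stream would get its stub from MyServiceGrpc.newStub(pool.next()). Here host, port, and the pool size are placeholders to tune for the actual deployment. The likely reason this helps is that a single channel typically multiplexes all of its streams over one HTTP/2 connection per backend address, so extra channels give extra connections. Each channel should still be shut down when the client stops.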

162

answered Sep 17 '22 14:09

simpadjo
