Multiple unary rpc calls vs long-running bidirectional streaming in grpc?

I have a use case where many clients need to keep sending a lot of metrics to the server (almost perpetually). The server needs to store these events, and process them later. I don't expect any kind of response from the server for these events.
I'm thinking of using gRPC for this. Initially I thought client-side streaming would do (like Envoy does), but the issue is that client-side streaming can't guarantee reliable delivery at the application level (i.e. if the stream closes midway, there's no way to know how many of the sent messages the server actually processed), and I can't afford that.
My thought process is, I should either go with bidi streaming, with acks in the server stream, or multiple unary rpc calls (perhaps with some batching of the events in a repeated field for performance).
Which of these would be better?
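For concreteness, here is a rough sketch in Go of the unary-with-batching variant I have in mind (the Metrics service, the PushBatch method, and the generated stubs are made-up placeholders, not an existing API):

    // Hypothetical service definition (sketch only):
    //   service Metrics {
    //     rpc PushBatch(MetricBatch) returns (Ack);  // unary; events batched in a repeated field
    //   }
    package main

    import (
        "context"
        "log"
        "time"

        pb "example.com/metrics/gen" // placeholder import for the generated stubs
        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"
    )

    func main() {
        conn, err := grpc.Dial("metrics.example.com:50051",
            grpc.WithTransportCredentials(insecure.NewCredentials()))
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()
        client := pb.NewMetricsClient(conn)

        batch := &pb.MetricBatch{}
        for range time.Tick(time.Second) {
            // ... append the events collected during the last second to batch.Events ...
            ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
            _, err := client.PushBatch(ctx, batch)
            cancel()
            if err != nil {
                log.Printf("push failed, keeping batch for retry: %v", err)
                continue // the whole batch failed; retry it on the next tick
            }
            batch = &pb.MetricBatch{} // acknowledged, safe to start a new batch
        }
    }

Since each batch either succeeds or returns an error, the client always knows exactly which events the server has accepted.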

asked Jun 26 '19 by avmohan


People also ask

What is gRPC bidirectional streaming?

In a gRPC bidirectional streaming scenario, the service and the client each send a sequence of messages over a single read-write stream. The two streams operate independently, so the client and server can read and write in whatever order they like.

Can gRPC server handle multiple clients?

Multiple gRPC clients can be created from a channel, including different types of clients. A channel and clients created from the channel can safely be used by multiple threads. Clients created from the channel can make multiple simultaneous calls.
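For example, in Go (imports omitted; the Greeter and Metrics stubs here are hypothetical generated clients, not real packages), different client types can share one channel:

    conn, err := grpc.Dial("server.example.com:50051",
        grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // Two different client types created from the same channel (connection).
    greeter := pb.NewGreeterClient(conn)
    metrics := pb.NewMetricsClient(conn)

    // The channel and the clients created from it are safe for concurrent use,
    // so these calls could also be issued from multiple goroutines at once.
    _, _ = greeter.SayHello(context.Background(), &pb.HelloRequest{Name: "grpc"})
    _, _ = metrics.PushBatch(context.Background(), &pb.MetricBatch{})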

What gRPC uses for better performance?

gRPC uses HTTP/2, a binary protocol that is faster and more efficient for machines to parse. HTTP/2 also supports multiplexing over a single connection, meaning multiple requests can be in flight at once without blocking each other.
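As a small Go illustration (reusing the hypothetical metrics client from the sketch above, imports omitted), many calls can be in flight at once over the single underlying HTTP/2 connection:

    // 100 simultaneous unary calls, each carried as its own HTTP/2 stream
    // over the one connection held by conn.
    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
            defer cancel()
            if _, err := metrics.PushBatch(ctx, &pb.MetricBatch{}); err != nil {
                log.Printf("call failed: %v", err)
            }
        }()
    }
    wg.Wait()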

What is the difference between RPC and gRPC?

RPC (remote procedure call) is a general communication style; gRPC is a specific RPC framework that by default uses Protocol Buffers for its interface definitions and wire format. Protobuf itself is not RPC; it is a serialization format that gRPC builds on. You can also create RPC services without Protobuf, which can be a reasonable choice for small to medium libraries and apps.


2 Answers

Streams guarantee that messages are delivered in the order they were sent, so concurrent messages all queue up behind one another on the same stream, which creates a bottleneck.

Google’s gRPC team advises against using streams instead of unary calls purely for performance. There have been arguments that streams should, in theory, have lower overhead, but that does not seem to hold in practice.

For a low number of concurrent requests, both approaches show comparable latencies. Under higher load, however, unary calls are noticeably more performant.

There is no apparent reason to prefer streams over unary calls here, given that streams come with additional problems:

  • Poor latency when there are concurrent requests
  • More complex implementation at the application level
  • Lack of load balancing: the client connects to one server and keeps using it, ignoring any servers added later (see the sketch below)
  • Poor resilience to network interruptions (even a brief TCP interruption fails the whole stream)

Some benchmarks here: https://nshnt.medium.com/using-grpc-streams-for-unary-calls-cd64a1638c8a
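To make the load-balancing point concrete, here is a rough Go sketch (the Metrics stubs and the StreamMetrics streaming method are hypothetical, as in the other sketches on this page; the round_robin service config itself is a standard grpc-go option). With round_robin, each unary call can land on a different backend, while a long-lived stream stays pinned to whichever backend it was opened against:

    conn, err := grpc.Dial("dns:///metrics.example.com:50051",
        grpc.WithTransportCredentials(insecure.NewCredentials()),
        // Spread calls across every backend the resolver returns.
        grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin":{}}]}`))
    if err != nil {
        log.Fatal(err)
    }
    client := pb.NewMetricsClient(conn)

    // Unary: each call goes through the balancer and may hit a different backend.
    for i := 0; i < 3; i++ {
        _, _ = client.PushBatch(context.Background(), &pb.MetricBatch{})
    }

    // Streaming: the stream is bound to one backend for its entire lifetime,
    // so backends added after this point will never see traffic from it.
    stream, err := client.StreamMetrics(context.Background())
    if err != nil {
        log.Fatal(err)
    }
    _ = stream.Send(&pb.Metric{}) // every Send goes to the same backend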

answered Jan 14 '23 by Nishant


"the issue is that client-side streaming can't guarantee reliable delivery at the application level (i.e. if the stream closes midway, there's no way to know how many of the sent messages the server actually processed), and I can't afford that"

This implies you need a response. Even if the response is just an acknowledgement, it is still a response from gRPC's perspective.
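As a sketch of what acknowledgements in the server stream could look like, in Go (the StreamMetrics bidirectional method, the Seq field, and the bookkeeping helpers are all invented for illustration; this is one possible design, not a prescribed one):

    // Client side of a bidirectional stream: metrics go out tagged with a
    // sequence number, acks come back on the server stream, and an event is
    // only dropped from the local buffer once its sequence number is acked.
    stream, err := client.StreamMetrics(context.Background())
    if err != nil {
        log.Fatal(err)
    }

    go func() {
        for {
            ack, err := stream.Recv()
            if err != nil {
                return // stream broken: anything not yet acked must be resent on a new stream
            }
            markDelivered(ack.Seq) // hypothetical bookkeeping helper
        }
    }()

    var seq uint64
    for _, ev := range pendingEvents() { // hypothetical buffer of unsent events
        seq++
        if err := stream.Send(&pb.Metric{Seq: seq, Payload: ev}); err != nil {
            break // reconnect and resend everything not yet acknowledged
        }
    }
    _ = stream.CloseSend()

Either way, once you need acks you are designing a request/response exchange, whether it is carried on a stream or as unary calls.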

The general approach should be "use unary," unless streaming solves a problem large enough to outweigh its added complexity. I discussed this at 2018 CloudNativeCon NA (there's a link to slides and YouTube for the video).

For example, if you have multiple backends, then each unary RPC may be sent to a different backend, which can create high overhead as those backends synchronize with each other. A streaming RPC chooses a backend at the beginning and keeps using that same backend, so streaming might reduce the frequency of backend synchronization and allow higher performance in the service implementation. But streaming adds complexity when errors occur, and in this case it makes the RPCs long-lived, which is more complicated to load balance. So you need to weigh whether the added complexity from streaming/long-lived RPCs provides a large enough benefit to your application.

We don't generally recommend using streaming RPCs for higher gRPC performance. It is true that sending a message on an existing stream is faster than starting a new unary RPC, but the improvement is a fixed cost saving and comes with higher complexity. Instead, we recommend using streaming RPCs when they provide higher application (your code) performance or lower application complexity.

answered Jan 14 '23 by Eric Anderson