
Avro message for Google Cloud Pub-Sub?

What is the best data format for publishing to and consuming from Pub/Sub? I am considering the Avro message format because it is binary. The use case is real-time microservice applications publishing Avro messages to Pub/Sub. Given that Avro is best suited to batching up messages (with a schema attached to the binary data) before publishing them, is it a suitable format for this microservice use case?

Roshan Fernando asked Sep 16 '25 11:09



1 Answer

The Google Cloud documentation contains some JSON examples, but when efficiency matters the main suggestion is to use the available client libraries, unless your needs go beyond what the client libraries offer or you are running on the Google App Engine standard environment, in which case the use of two APIs is suggested.

In fact, the most important factor for efficiency is using the gRPC API (which the client libraries call by default) instead of the REST API. As mentioned here:

There are two major factors at work here: more efficient data encoding and HTTP/2. gRPC keeps data in binary both in client memory and on the wire by building on HTTP/2 and Protocol Buffers. This eliminates processing and space required for string encoding schemes such as Base64 or JSON. In addition, HTTP/2 itself makes things go faster with multiplexed requests over a single connection and header compression.
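
To make the encoding overhead in the quote concrete, here is a minimal sketch using only the Python standard library. It compares a raw binary payload with the same payload wrapped as Base64 text inside a JSON body. The `{"data": ...}` envelope and the `json_envelope_size` helper are simplified assumptions for illustration, not the exact request shape of any API.

```python
import base64
import json
import os


def json_envelope_size(payload: bytes) -> int:
    """Size in bytes of a JSON body carrying the payload as Base64 text."""
    encoded = base64.b64encode(payload).decode("ascii")
    body = json.dumps({"data": encoded})  # hypothetical, simplified envelope
    return len(body.encode("utf-8"))


# Stand-in for any binary message, for example a serialized Avro record.
payload = os.urandom(300)

raw_size = len(payload)
b64_size = len(base64.b64encode(payload))
json_size = json_envelope_size(payload)

# Base64 turns every 3 bytes into 4 characters, so 300 bytes become 400.
print(raw_size, b64_size, json_size)
```

Base64 alone adds about a third to the payload size before any JSON framing, which is the cost a binary transport avoids.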

I did not find any explicit mention of a recommended data format. I suggest building the messages in your preferred language, for example Python. Client library description here and sample code here.

Based on this StackOverflow post, you can make your Pub/Sub system more efficient by:

  1. Making sure you are using gRPC
  2. Batching where possible, to reduce the number of calls and eliminate latency.
  3. Only compressing when needed and after benchmarking (implies extra logic in your application)
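
Points 2 and 3 above can be sketched with the standard library alone. The `Batcher` class, its `max_messages` and `max_bytes` limits, and the `maybe_compress` helper are all hypothetical names chosen for this illustration. The `send` callback stands in for whatever publish call your client exposes, and a time-based flush is left out for brevity.

```python
import zlib
from typing import Callable, List, Tuple


class Batcher:
    """Collects messages and hands them to `send` in one call per batch."""

    def __init__(self, send: Callable[[List[bytes]], None],
                 max_messages: int = 100, max_bytes: int = 1_000_000) -> None:
        self._send = send
        self._max_messages = max_messages
        self._max_bytes = max_bytes
        self._pending: List[bytes] = []
        self._pending_bytes = 0

    def add(self, message: bytes) -> None:
        # Flush first if this message would push the batch over the byte limit.
        if self._pending and self._pending_bytes + len(message) > self._max_bytes:
            self.flush()
        self._pending.append(message)
        self._pending_bytes += len(message)
        # Flush as soon as the message-count limit is reached.
        if len(self._pending) >= self._max_messages:
            self.flush()

    def flush(self) -> None:
        """Sends whatever is pending; does nothing when the batch is empty."""
        if self._pending:
            self._send(self._pending)
            self._pending = []
            self._pending_bytes = 0


def maybe_compress(message: bytes) -> Tuple[bool, bytes]:
    """Returns (was_compressed, data), compressing only when it saves space."""
    packed = zlib.compress(message)
    if len(packed) < len(message):
        return True, packed
    return False, message


# Usage: seven messages go out in three calls instead of seven.
calls: List[List[bytes]] = []
batcher = Batcher(send=calls.append, max_messages=3)
for i in range(7):
    batcher.add(f"event-{i}".encode())
batcher.flush()  # send the final partial batch
print([len(batch) for batch in calls])
```

The consumer has to know whether a given payload was compressed, so the flag returned by `maybe_compress` would need to travel with the message, for instance as a message attribute. That is part of the extra application logic point 3 warns about.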

Finally, if you intend to deploy a robust Pub/Sub system, have a look at this post by Anusha Ramesh. She is now a Project Manager at Google, and she suggests and elaborates on three tips:

  1. Don't underestimate the importance of capacity planning.
  2. Make sure your pub/sub system is fault-tolerant.
  3. NSM: Never Stop Monitoring.
Rubén C. answered Sep 18 '25 11:09
