I'm planning to use Kafka as a persistent log for event sourcing, and I'm currently investigating serialization options. My focus is on using Thrift to serialize and deserialize the messages I will be storing in Kafka.
When using Thrift to serialize messages for Kafka, the simplest approach appears to be a single Thrift struct per Kafka topic.
Question: Is this a good pattern to follow in practice? If not, can you please list the disadvantages of following this approach?
Note: If you think this question doesn't meet Stack Overflow standards, please help me improve it!
Thrift structs do not carry any indicator of their own type on the wire (at least, not in the default binary protocol). To deserialize a tree of Thrift data, you therefore need to know the struct type at the root, so your idea of one struct type per topic is sensible.
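To illustrate why the reader has to know the root type, here is a minimal sketch that hand-encodes two hypothetical structs in the layout the binary protocol uses for a struct with one string field (a type byte and field id, a length-prefixed value, then a stop byte). The struct names `UserCreated` and `OrderPlaced` and the helper `encode_single_string_struct` are made up for this example; real code would use Thrift-generated classes rather than packing bytes by hand.

```python
import struct

T_STOP = 0     # marks the end of a struct
T_STRING = 11  # Thrift type id for string/binary

def encode_single_string_struct(field_id, value):
    """Hand-rolled encoding of a struct with one string field (illustration only)."""
    data = value.encode("utf-8")
    header = struct.pack(">bh", T_STRING, field_id)  # type byte + 16-bit field id
    body = struct.pack(">i", len(data)) + data       # 32-bit length + raw bytes
    return header + body + bytes([T_STOP])

# Hypothetical: struct UserCreated { 1: string name }
user_created = encode_single_string_struct(1, "alice")
# Hypothetical: struct OrderPlaced { 1: string sku }
order_placed = encode_single_string_struct(1, "alice")

# The struct name never reaches the wire, so the two payloads are identical.
assert user_created == order_placed
```

Nothing in the bytes says which struct was written, which is why the consumer needs that knowledge from somewhere else, such as the topic name.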
One thing that could be useful in this case, though, is a Thrift union. You can define a single union that contains a field for each of the types you'd like to be able to publish on the topic, and the consumer can just deserialize the union type and check which field is set. There is very little overhead to this approach: a union is written like a struct with exactly one field set, so the extra cost is roughly one field header and a stop byte.
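As a rough sketch of how a consumer could dispatch on a union, the snippet below imitates the binary-protocol layout by hand: the field id in the header acts as the tag that says which variant was published. The union `Event`, its field names, and the helpers `encode_union` and `decode_union` are assumptions for illustration, and each variant is carried as a plain string instead of a nested struct to keep the example short. In practice the Thrift-generated union class would do this for you.

```python
import struct

T_STOP, T_STRING = 0, 11  # Thrift type ids: end-of-struct marker, string/binary

# Hypothetical IDL this sketch imitates:
#   union Event { 1: UserCreated user_created, 2: OrderPlaced order_placed }
FIELD_NAMES = {1: "user_created", 2: "order_placed"}

def encode_union(field_id, payload):
    """Write exactly one field, then the stop byte (illustration only)."""
    data = payload.encode("utf-8")
    return (struct.pack(">bh", T_STRING, field_id)
            + struct.pack(">i", len(data)) + data
            + bytes([T_STOP]))

def decode_union(data):
    """Return (variant_name, payload) by reading the single field that is set."""
    field_type, field_id = struct.unpack_from(">bh", data, 0)
    if field_type != T_STRING:
        raise ValueError("this sketch only handles string payloads")
    (length,) = struct.unpack_from(">i", data, 3)
    payload = data[7:7 + length].decode("utf-8")
    if data[7 + length] != T_STOP:
        raise ValueError("expected exactly one field followed by a stop byte")
    return FIELD_NAMES[field_id], payload

# A consumer can branch on the variant name without knowing it in advance.
variant, payload = decode_union(encode_union(2, "sku-42"))
assert (variant, payload) == ("order_placed", "sku-42")
```

With this shape the topic still has a single root type (the union), and new event types can be added as new union fields with fresh field ids.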