I see Thrift and Protocol Buffers mentioned a lot, but I don't really understand what they're used for. From my limited understanding, they're basically used when you want to do cross-language serialization, i.e., when you have some data structures in one language that you want to send off to another program written in another language.
Is this correct? Are they used for anything else?
(From my again limited understanding, I think Thrift and Protocol Buffers are basically two different versions of the same thing -- feel free to correct me or elaborate.)
Thrift is an interface definition language and binary communication protocol used for defining and creating services for numerous programming languages.
Protocol buffers provide a language-neutral, platform-neutral, extensible mechanism for serializing structured data in a forward-compatible and backward-compatible way. It's like JSON, except it's smaller and faster, and it generates native language bindings.
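As an illustration, a minimal message schema in a `.proto` file might look like this (the file and field names here are invented for the example):

```proto
// person.proto -- a hypothetical schema for illustration
syntax = "proto3";

message Person {
  int64 id = 1;                // field numbers, not names, go on the wire
  string name = 2;
  repeated string emails = 3;  // a list of strings
}
```

Running `protoc` over a file like this generates native bindings (classes in Java, structs in Go, and so on) that all read and write the same compact wire format, which is what makes the cross-language part work.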
They both offer many of the same features; however, there are some differences:

- Thrift supports exceptions.
- Protocol Buffers has much better documentation/examples.
- Thrift has a built-in set type.
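To illustrate those Thrift-specific features, a hypothetical `.thrift` IDL file using both an exception and a set might look like:

```thrift
// user.thrift -- hypothetical IDL for illustration
exception UserNotFound {
  1: string message,
}

struct User {
  1: i64 id,
  2: string name,
  3: set<string> roles,   // built-in set type
}

service UserService {
  // declared exceptions cross the wire as typed errors
  User getUser(1: i64 id) throws (1: UserNotFound nf),
}
```

The Thrift compiler turns this into client and server code in each target language, with `UserNotFound` surfacing as a native exception where the language supports one.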
Protocol Buffers, a.k.a. Protobuf, is the most commonly used IDL (Interface Definition Language) for gRPC. It's where you define your data and function contracts, in the form of a proto file.
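For gRPC, the same `.proto` file also declares the function contracts as a service. A hypothetical example (all names invented for illustration):

```proto
syntax = "proto3";

message GetUserRequest {
  int64 id = 1;
}

message GetUserResponse {
  string name = 1;
}

// gRPC generates client stubs and server skeletons from this
service UserService {
  rpc GetUser(GetUserRequest) returns (GetUserResponse);
}
```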
They are primarily serialization protocols. Any time you need to transfer data between machines or processes, or store it on disk, it needs to be serialized.
XML, JSON, etc. work OK, but they have certain overheads that make them undesirable: in addition to limited features, they are relatively large and computationally expensive to process in either direction. Size can be improved by compression, but that adds yet more processing cost. They do have the advantage of being human-readable, but most data is not read by humans.
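To make the size overhead concrete, here is a rough Python sketch comparing a JSON encoding of a record against a fixed binary layout, as a crude stand-in for what Protobuf or Thrift produce on the wire (the field names and format string are invented for illustration):

```python
import json
import struct

record = {"id": 123456, "score": 3.14, "active": True}

# Text encoding: repeats every field name and renders numbers as digits.
as_json = json.dumps(record).encode("utf-8")

# Binary encoding: little-endian int64 + float64 + bool = 17 bytes total.
# Real Protobuf does better still, e.g. varint-encoding small integers.
as_binary = struct.pack("<qd?", record["id"], record["score"], record["active"])

print(len(as_json), len(as_binary))  # the binary form is much smaller
```

The gap widens further for repeated records, since the text format repeats the field names every time while a schema-based binary format carries them once, in the schema.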
Now, people could spend ages manually writing tedious, bug-ridden, sub-optimal, non-portable formats that are less verbose, or they can use a well-tested, general-purpose serialization format that is well-documented, cross-platform, cheap to process, and designed by people who spend far too long worrying about serialization to be friendly (for example, version-tolerant). Ideally, it would also provide a platform-neutral description layer (think WSDL or MEX) that lets you easily say "here's what the data looks like" to any other developer, without knowing what tools, language, or platform they are using, and have them consume the data painlessly without writing a new serializer/deserializer from scratch.
That is where protobuf and thrift come in.
Volume-wise, in most cases I would actually expect both ends to use the same technology within the same company: simply put, they need to get data from A to B with a minimum of fuss and overhead, or they need to store it and load it back later (for example, we use protobuf inside Redis blobs as a secondary cache).