I was looking for a very generic, strict, and platform-independent serialization framework, and I discovered something called ASN.1.
It looks like something related to serialization, but I couldn't work out what it actually is. I read the Wikipedia article and the ITU article, but it's still hard to tell.
I have many questions. Maybe what I need is an overall description of ASN.1 that sets it apart from other approaches.
What is ASN.1? I guess Wikipedia pretty much tells you what it is. To understand ASN.1, you must realize that ASN.1 separates two concerns: describing your data and describing what your data looks like in transmission.
The first part is describing your data. ASN.1 specifies an abstract syntax notation (thus the name ASN.1) to do this. For example, I can specify that a Coordinate is a complex value consisting of a sequence of two integers which must be between 0 and 100:
Coordinate ::= SEQUENCE {x INTEGER(0..100), y INTEGER(0..100) }
The next part is deciding how to encode this to bytes for transmission. ASN.1 specifies a few standard sets of encoding rules to do this. Different encoding rules have their own advantages. Most are binary, but one is text-based (XER encodes to XML). The encoding rules specify, at the bit level, how to represent values described using the above abstract description. Everyone following the standard (and having agreed on the encoding rules) gets the very same string of bits.
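To make the "same string of bits" point concrete, here is a minimal hand-written sketch of how the Coordinate value {x 5, y 100} could be encoded under DER, one of the binary encoding rule sets. The helper names (`der_length`, `der_integer`, `der_sequence`, `encode_coordinate`) are made up for illustration and only cover what this one type needs; a real project would normally rely on an ASN.1 tool rather than code like this.

```python
# Hypothetical helpers: a hand-rolled DER encoder for the Coordinate type only.

def der_length(n: int) -> bytes:
    """Encode a length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_integer(value: int) -> bytes:
    """Encode an INTEGER (tag 0x02) as minimal-length two's complement."""
    magnitude = value if value >= 0 else ~value
    size = magnitude.bit_length() // 8 + 1
    content = value.to_bytes(size, "big", signed=True)
    return b"\x02" + der_length(len(content)) + content


def der_sequence(*elements: bytes) -> bytes:
    """Wrap already-encoded elements in a SEQUENCE (tag 0x30)."""
    content = b"".join(elements)
    return b"\x30" + der_length(len(content)) + content


def encode_coordinate(x: int, y: int) -> bytes:
    """Encode Coordinate ::= SEQUENCE { x INTEGER(0..100), y INTEGER(0..100) }."""
    for name, value in (("x", x), ("y", y)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be in 0..100, got {value}")
    return der_sequence(der_integer(x), der_integer(y))


print(encode_coordinate(5, 100).hex())  # 3006020105020164
```

Note that DER spends a tag and a length byte on every value and does not take advantage of the 0..100 constraint, so this value costs 8 bytes.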
The PER encoding rules use the constraints in your abstract definition to provide more compact encodings. For example, if you know your integers range 0..100, you only need 7 bits to encode those values.
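The 7-bit figure can be checked with a few lines of arithmetic. The sketch below assumes the unaligned variant of PER, where a constrained integer is sent as its offset from the lower bound in the fewest bits that cover the range; the function names are my own, not part of any library.

```python
# Illustrative only: bit widths for constrained integers under unaligned PER.

def per_constrained_bits(lower: int, upper: int) -> int:
    """Bits needed for an INTEGER(lower..upper) value."""
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    range_size = upper - lower + 1
    # A range with a single possible value needs no bits at all.
    return (range_size - 1).bit_length()


def per_constrained_field(value: int, lower: int, upper: int) -> str:
    """Return the bit string for one constrained integer."""
    if not lower <= value <= upper:
        raise ValueError(f"{value} is outside {lower}..{upper}")
    width = per_constrained_bits(lower, upper)
    return format(value - lower, f"0{width}b") if width else ""


def encode_coordinate_bits(x: int, y: int) -> str:
    """Coordinate has no optional fields, so it is just the two fields back to back."""
    return per_constrained_field(x, 0, 100) + per_constrained_field(y, 0, 100)


print(per_constrained_bits(0, 100))    # 7
print(encode_coordinate_bits(5, 100))  # 00001011100100
```

So the whole Coordinate fits in 14 bits before padding, compared with several bytes under the tag-length-value style encodings.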
ASN.1 does not define a 32-bit integer or a 1-bit boolean. Actually, that is thinking about ASN.1 in the wrong way, because that is thinking about the byte representation of the values. Again, ASN.1 separates a description of your values (I have an integer that can be between 0 and 100) vs. the representation of your values (I can represent that value in 7 bits).
I am not aware of a reference implementation; I am not sure it makes sense to speak of one. My company sells a tool that generates C/C++/Java/C# data structures and code from the abstract syntax definition. There are some similar free tools; I don't know their quality.
How does ASN.1 compare to serialization frameworks? ASN.1 is not a serialization framework. That is, it does not say anything about how to take any kind of programming data structures or objects and encode them. It provides a way to abstractly describe data values and specifies rules to derive an encoding of those values. A common ASN.1 usage is to use code generators to generate programming data structures from the abstract description, along with encoding/decoding methods that follow the chosen encoding rules. Of course, one can do this entirely by hand also.
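As a rough idea of what such generated code might look like, here is a hand-written sketch for the Coordinate type. It is not the output of any particular tool: the class layout and method names are assumptions, and the encoding chosen is a JSON form in the spirit of JER (a SEQUENCE as an object keyed by field name, an INTEGER as a number) simply because it is easy to read.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:
    """Hypothetical stand-in for a class generated from
    Coordinate ::= SEQUENCE { x INTEGER(0..100), y INTEGER(0..100) }."""

    x: int
    y: int

    def __post_init__(self) -> None:
        # Enforce the constraints from the abstract definition.
        for name in ("x", "y"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name} must be an integer")
            if not 0 <= value <= 100:
                raise ValueError(f"{name} must be in 0..100, got {value}")

    def encode(self) -> bytes:
        """Encode to a compact JSON object keyed by field name."""
        text = json.dumps({"x": self.x, "y": self.y}, separators=(",", ":"))
        return text.encode("utf-8")

    @classmethod
    def decode(cls, data: bytes) -> "Coordinate":
        """Decode and re-validate, so bad input is rejected at the boundary."""
        obj = json.loads(data.decode("utf-8"))
        if not isinstance(obj, dict) or set(obj) != {"x", "y"}:
            raise ValueError("expected an object with exactly the fields x and y")
        return cls(obj["x"], obj["y"])


encoded = Coordinate(5, 100).encode()
print(encoded)                     # b'{"x":5,"y":100}'
print(Coordinate.decode(encoded))  # Coordinate(x=5, y=100)
```

The application code only ever touches `Coordinate`; swapping in a different set of encoding rules would mean changing `encode` and `decode`, not the abstract definition.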
Advantages of ASN.1? The ability to use tools to generate code. Along with that, the flexibility to produce different encodings (e.g. XML, PER) from the same abstract syntax.
Disadvantages of ASN.1? Probably complexity, though I suspect a person could get a lot done with it, using tools, without having to digest all the complexity (e.g. you are likely to rely on tools to do the right thing rather than trying to digest the encoding rule specs).
UPDATE: There is now a second text-based set of encoding rules. JER encodes to JSON.
1. It's a serialization standard defined jointly by ITU-T and ISO/IEC.
2. Yes, although the smallest space a value will occupy is (afaik) 5 bits.
3. I don't know of a full one, although I'm not claiming to be all-knowing.
4. Hard to answer neutrally, but in my experience the main one is complexity: getting close to a full implementation is hard.
5. See 4. ASN.1 is fairly space efficient (protobuf may give it a run for its money), but it also seems fairly complex compared to most other serialization methods. In the end, complexity often loses (as do specifications you have to pay to read).