
Serialize fixed size Map to CBOR

I have the following JSON:

[
  {
    2: {
      "c": true
    }
  },
  {
    3: {
      "p": 10
    }
  }
]

I would like to convert it to CBOR. According to cbor.me, the expected output is:

82A102A16163F5A103A161700A

But when using the Jackson binary CBOR serializer, I get the following output:

82BF02BF6163F5FFFFBF03BF61700AFFFF

This is not wrong, but it is not optimal: it is 4 bytes longer than it needs to be (17 bytes instead of 13).

I then tried to serialize the JSON manually, but got the same result:

@Override
public void serialize(Request value, JsonGenerator jgen, SerializerProvider provider)
        throws IOException, JsonProcessingException {
    jgen.writeStartArray(value.getDataList().size());
    for (Data data : value.getDataList()) {
        jgen.writeStartObject(new Map[1]);
        jgen.writeFieldId(data.getItem());
        jgen.writeStartObject();
        if (data.getObject().getC() != null) {
            jgen.writeBooleanField("c", data.getObject().getC());
        }
        if (data.getObject().getP() != null) {
            jgen.writeNumberField("p", data.getObject().getP());
        }
        jgen.writeEndObject();
        jgen.writeEndObject();
    }
    jgen.writeEndArray();
}

Is this a bug in the Jackson binary format library, or am I missing a configuration property on the ObjectMapper?

EDIT: This seems to be a known issue: https://github.com/FasterXML/jackson-dataformats-binary/issues/3

Bibu asked Nov 07 '22 10:11


1 Answer

You've already found the answer: use a newer or better encoder. But for anyone else who comes here later...

The issue is that the OP's encoder was writing indefinite-length maps, each one closed by a "break" primitive (0xFF) before moving on to the next item.

Compare the version with break primitives:

82             # array(2)
   BF          # map(*)
      02       # unsigned(2)
      BF       # map(*)
         61    # text(1)
            63 # "c"
         F5    # primitive(21)
         FF    # primitive(*)
      FF       # primitive(*)
   BF          # map(*)
      03       # unsigned(3)
      BF       # map(*)
         61    # text(1)
            70 # "p"
         0A    # unsigned(10)
         FF    # primitive(*)
      FF       # primitive(*)

To the version without them:

82             # array(2)
   A1          # map(1)
      02       # unsigned(2)
      A1       # map(1)
         61    # text(1)
            63 # "c"
         F5    # primitive(21)
   A1          # map(1)
      03       # unsigned(3)
      A1       # map(1)
         61    # text(1)
            70 # "p"
         0A    # unsigned(10)

Do you see the map(*) vs map(1)?

By using maps with specific lengths instead of indefinite length, the resulting CBOR can use "One map coming, here it is" instead of "IDK! Maps coming! Here is one! Now Stop!"
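To make the difference concrete, here is a minimal hand-rolled sketch that writes the OP's structure with definite-length headers, using only the JDK. The class and helper names (DefiniteLengthCbor, head, text1) are made up for illustration and are not part of Jackson or any CBOR library; the sketch only handles counts and integers below 24, which is all this example needs.

```java
import java.io.ByteArrayOutputStream;

class DefiniteLengthCbor {
    // Initial-byte bases for the CBOR major types used in this example.
    static final int UNSIGNED = 0x00; // major type 0
    static final int TEXT     = 0x60; // major type 3
    static final int ARRAY    = 0x80; // major type 4
    static final int MAP      = 0xA0; // major type 5

    // Writes a one-byte header: major type in the top 3 bits, count/value in the low 5.
    static void head(ByteArrayOutputStream out, int majorBase, int n) {
        if (n < 0 || n > 23) {
            throw new IllegalArgumentException("this sketch only handles 0..23");
        }
        out.write(majorBase | n);
    }

    // Writes a one-character ASCII text string, e.g. "c" -> 61 63.
    static void text1(ByteArrayOutputStream out, char c) {
        head(out, TEXT, 1);
        out.write(c);
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes [{2: {"c": true}}, {3: {"p": 10}}] with every length declared up front.
    static String encode() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        head(out, ARRAY, 2);      // 82  array(2)
        head(out, MAP, 1);        // A1  map(1)
        head(out, UNSIGNED, 2);   // 02  key 2
        head(out, MAP, 1);        // A1  map(1)
        text1(out, 'c');          // 61 63
        out.write(0xF5);          // F5  true
        head(out, MAP, 1);        // A1  map(1)
        head(out, UNSIGNED, 3);   // 03  key 3
        head(out, MAP, 1);        // A1  map(1)
        text1(out, 'p');          // 61 70
        head(out, UNSIGNED, 10);  // 0A  value 10
        return toHex(out.toByteArray());
    }

    public static void main(String[] args) {
        System.out.println(encode()); // prints 82A102A16163F5A103A161700A
    }
}
```

No FF bytes are needed because each map header already says how many entries follow, which is where the 4 saved bytes come from.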

In the second example, there is still a primitive, but it's not a break. The 0xF5 effectively means "true". Written in binary, 0xF5 is 111 10101: the first three bits (111) are the CBOR major type, 7, and the remaining five bits (10101) are decimal 21, the value CBOR assigns to "true". The break byte 0xFF has the same major type, with all five low bits set (31).
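The bit arithmetic can be checked with a few lines of plain Java. This is only an illustrative sketch; the class and method names (InitialByte, majorType, additionalInfo) are invented here and do not come from any library.

```java
class InitialByte {
    // The top 3 bits of a CBOR initial byte are the major type.
    static int majorType(int b) {
        return (b & 0xFF) >>> 5;
    }

    // The low 5 bits are the "additional information" (a count, a value, or a marker).
    static int additionalInfo(int b) {
        return b & 0x1F;
    }

    public static void main(String[] args) {
        // F5 = true, FF = break, A1 = map(1), BF = map(*)
        for (int b : new int[] {0xF5, 0xFF, 0xA1, 0xBF}) {
            System.out.printf("0x%02X -> major type %d, additional info %d%n",
                    b, majorType(b), additionalInfo(b));
        }
        // First line printed: 0xF5 -> major type 7, additional info 21
    }
}
```

Additional info 31 is what marks both the indefinite-length map (BF) and the break (FF); A1 carries the entry count 1 directly.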

Also, it's entirely valid in CBOR to use the integer 2 as a map key whose value is the map {"c": true}. But beware that JSON only allows strings as keys, so converting to JSON will be problematic when you use integers as keys, if that is something you are concerned about.

This came down to an encoder that used indefinite-length maps and breaks where it did not need to. There is a time to use those, but only in a "streaming" mode, which is unlikely for the example given. If you have all the items up front when you encode, indefinite lengths are not required. If you have some number of map entries but aren't sure how many, and want to get started on encoding what you do have, that's when you'll want indefinite-length maps or strings.
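For contrast, here is a rough sketch of the streaming case, where the encoder is handed an Iterator and cannot know the entry count before it has to write the map header. The names (StreamingMap, encode) are hypothetical, and the sketch is restricted to short text keys with boolean values to keep it small.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class StreamingMap {
    static final int MAP_INDEFINITE = 0xBF; // map(*): entry count not known yet
    static final int BREAK = 0xFF;          // closes the indefinite-length map

    // Encodes entries as they arrive; keys must be shorter than 24 bytes in this sketch.
    static String encode(Iterator<Map.Entry<String, Boolean>> entries) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(MAP_INDEFINITE);
        while (entries.hasNext()) {
            Map.Entry<String, Boolean> e = entries.next();
            byte[] key = e.getKey().getBytes(StandardCharsets.UTF_8);
            if (key.length > 23) {
                throw new IllegalArgumentException("this sketch only handles short keys");
            }
            out.write(0x60 | key.length);           // text(n) header
            out.write(key, 0, key.length);          // key bytes
            out.write(e.getValue() ? 0xF5 : 0xF4);  // true / false
        }
        out.write(BREAK);
        StringBuilder sb = new StringBuilder();
        for (byte b : out.toByteArray()) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Boolean>> one = Collections.singletonList(
                new AbstractMap.SimpleEntry<String, Boolean>("c", true));
        System.out.println(encode(one.iterator())); // prints BF6163F5FF
    }
}
```

The output matches the inner map(*) from the OP's encoding. The extra byte per map is the price of not having to count the entries first.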

mint branch conditioner answered Nov 15 '22 06:11