Shorter encoding than Base64

I have this string, which I encode as Base64:

{
  "appId": "70cce8adb93c4c968a7b1483f2edf5c1",
  "apiKey": "a65d8f147fa741b0a6d7fc43e18363c9",
  "entityType": "Todo",
  "entityId": "2-0",
  "blobName": "picture"
}

The output is:

ewogICJhcHBJZCI6ICI3MGNjZThhZGI5M2M0Yzk2OGE3YjE0ODNmMmVkZjVjMSIsCiAgImFwaUtleSI6ICJhNjVkOGYxNDdmYTc0MWIwYTZkN2ZjNDNlMTgzNjNjOSIsCiAgImVudGl0eVR5cGUiOiAiVG9kbyIsCiAgImVudGl0eUlkIjogIjItMCIsCiAgImJsb2JOYW1lIjogInBpY3R1cmUiCn0=

This is quite long for my use case. I can't use one-way hashing because the value needs to be decoded on the other end.

Is there an encoding that is at least just ~1/4 the size compared to Base64 encoding?

asked Aug 17 '19 06:08 by quarks


1 Answer

Base64 encodes binary data as characters drawn from a 64-character alphabet, so each character carries 6 bits. That entails a size increase of 33.3%; i.e. every 3 bytes become 4 characters.
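As a quick check of that arithmetic, here is a minimal sketch using Python's standard `base64` module. The helper name `base64_length` is made up for illustration, and the formula assumes the usual `=` padding.

```python
import base64


def base64_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input (hypothetical helper)."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((n_bytes + 2) // 3)


data = b"x" * 31
encoded = base64.b64encode(data)

# The two length figures printed last should agree.
print(len(data), len(encoded), base64_length(len(data)))
```

For inputs whose length is not a multiple of 3, padding pushes the overhead slightly above 33.3%.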

Is there an encoding that is at least just ~1/4 the size compared to Base64 encoding?

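Base64 output is 4/3 the size of the original data, so a reduction to 1/4 of the Base64 size implies that the transmitted form must be about 1/3 the size of the original data. This can only be achieved if the original data is highly compressible. Because the text encoding adds its own overhead on top of the compressed bytes, you would need to do the following: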

  1. Compress the original byte sequence by more than a factor of 4.
  2. Apply a binary to text encoding.

Given that the first step only works for compressible data and a lot of data formats (e.g. images, video, sound, ZIP files) are already compressed, the answer to your question in the general case is No.

For your specific example, I think that the answer is "probably no". That JSON string has a fair amount of redundancy in it, but I doubt that a general purpose compression algorithm could compress it by a factor of 4.
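One way to test that doubt is to compress the JSON before encoding it and compare lengths. The sketch below uses `zlib` from the Python standard library as a stand-in for "a general purpose compression algorithm"; other compressors may do somewhat better or worse.

```python
import base64
import zlib

# The JSON text from the question, with its original whitespace.
text = (
    '{\n'
    '  "appId": "70cce8adb93c4c968a7b1483f2edf5c1",\n'
    '  "apiKey": "a65d8f147fa741b0a6d7fc43e18363c9",\n'
    '  "entityType": "Todo",\n'
    '  "entityId": "2-0",\n'
    '  "blobName": "picture"\n'
    '}'
)
raw = text.encode("utf-8")

plain_b64 = base64.b64encode(raw)
compressed_b64 = base64.b64encode(zlib.compress(raw, 9))

# Ratio of the compressed-then-encoded length to the plain Base64 length.
ratio = len(compressed_b64) / len(plain_b64)
print(len(plain_b64), len(compressed_b64), round(ratio, 2))
```

On an input this short, expect the ratio to stay well above 0.25: the two hex identifiers alone carry 32 bytes of essentially random data, which no lossless compressor can remove.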

A better approach would be to design a compact binary representation:

  • Encode the appId and apiKey as raw bytes (each is 32 hex digits, i.e. 16 bytes) rather than as hex text.
  • Encode the names as ASCII or UTF-8 byte sequences, each prefixed with a byte count.
  • Get rid of the attribute names.
  • Get rid of the other JSON syntax overheads.

Then Base64 encode the binary representation.
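The approach above could be sketched as follows. The byte layout is an assumption made for illustration: a fixed field order, appId and apiKey always exactly 32 hex digits, and a one-byte length prefix per string (so each string is at most 255 bytes). The function names `pack` and `unpack` are hypothetical.

```python
import base64

STRING_FIELDS = ("entityType", "entityId", "blobName")


def pack(record: dict) -> bytes:
    """Pack a record into the assumed layout: 16 + 16 bytes, then length-prefixed strings."""
    out = bytearray()
    for field in ("appId", "apiKey"):
        value = bytes.fromhex(record[field])
        if len(value) != 16:
            raise ValueError(f"{field} must be exactly 32 hex digits")
        out += value
    for field in STRING_FIELDS:
        value = record[field].encode("utf-8")
        if len(value) > 255:
            raise ValueError(f"{field} is too long for a 1-byte length prefix")
        out.append(len(value))
        out += value
    return bytes(out)


def unpack(blob: bytes) -> dict:
    """Reverse of pack(); hex digits come back in lowercase."""
    record = {"appId": blob[:16].hex(), "apiKey": blob[16:32].hex()}
    pos = 32
    for field in STRING_FIELDS:
        length = blob[pos]
        record[field] = blob[pos + 1 : pos + 1 + length].decode("utf-8")
        pos += 1 + length
    return record


record = {
    "appId": "70cce8adb93c4c968a7b1483f2edf5c1",
    "apiKey": "a65d8f147fa741b0a6d7fc43e18363c9",
    "entityType": "Todo",
    "entityId": "2-0",
    "blobName": "picture",
}
token = base64.b64encode(pack(record)).decode("ascii")
print(len(token), token)
```

For the sample record this layout packs to 49 bytes, which Base64 turns into 68 characters, about 30% of the original 224. Both ends have to agree on the field order, since the attribute names are no longer transmitted.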

answered Sep 21 '22 08:09 by Stephen C