
Lossless compression method to shorten a string before base64 encoding?

I just built a small web app for previewing HTML documents. It generates URLs that contain the HTML (and all inline CSS and JavaScript) as base64-encoded data. The problem is that the URLs quickly get long. What is the de facto standard way (preferably in JavaScript) to compress the string first without data loss?

PS: I read about Huffman and Lempel-Ziv in school some time ago, and I remember really enjoying LZW :)

EDIT:

Solution found: it seems that rawStr => utf8Str => lzwStr => base64Str is the way to go. I'm also working on adding Huffman compression between the UTF-8 and LZW stages. The problem so far is that too many characters become very long when encoded to base64.
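For illustration, here is a minimal sketch of the outer two stages of that chain (rawStr => utf8Str and the final base64Str step), using the URL-safe base64 alphabet so the result can go into a URL without further escaping. The function names are hypothetical, and the sketch assumes an environment that provides `TextEncoder`, `TextDecoder`, `btoa` and `atob` (modern browsers, recent Node.js). A compression stage would sit between the two steps. Base64 emits 4 characters for every 3 input bytes, so anything that inflates the byte count before this step makes the URL longer.

```javascript
// Hypothetical helpers: turn bytes into URL-safe base64 and back.
function bytesToBase64Url(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  // Swap the two URL-unsafe characters and drop the "=" padding.
  return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

function base64UrlToBytes(text) {
  // Restore the standard alphabet and the padding that was dropped.
  const padding = "=".repeat((4 - (text.length % 4)) % 4);
  const binary = atob(text.replace(/-/g, "+").replace(/_/g, "/") + padding);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// rawStr => UTF-8 bytes => (compression would go here) => base64url
function encodeForUrl(rawStr) {
  const utf8Bytes = new TextEncoder().encode(rawStr);
  return bytesToBase64Url(utf8Bytes);
}

// base64url => (decompression would go here) => UTF-8 bytes => rawStr
function decodeFromUrl(encoded) {
  return new TextDecoder().decode(base64UrlToBytes(encoded));
}
```

Calling `decodeFromUrl(encodeForUrl(html))` should give back the original string, including non-ASCII characters.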

bennedich asked Nov 10 '10 13:11



2 Answers

Check out this answer. It mentions functions for LZW compression/decompression (via http://jsolait.net/, specifically http://jsolait.net/browser/trunk/jsolait/lib/codecs.js).
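For readers who want to see the idea without following the links, here is a textbook LZW sketch over bytes. It is not the jsolait code; the function names are made up for illustration, the dictionary is allowed to grow without limit, and the output is a plain array of numeric codes that would still need to be packed into bytes before base64 encoding.

```javascript
// Textbook LZW over byte values (0-255). Hypothetical names, not jsolait's API.
function lzwCompress(bytes) {
  const dict = new Map();
  for (let i = 0; i < 256; i++) dict.set(String.fromCharCode(i), i);
  let nextCode = 256;
  let phrase = "";
  const codes = [];
  for (const b of bytes) {
    const ch = String.fromCharCode(b);
    const candidate = phrase + ch;
    if (dict.has(candidate)) {
      phrase = candidate; // keep extending the current match
    } else {
      codes.push(dict.get(phrase)); // emit the longest known phrase
      dict.set(candidate, nextCode++); // learn the new phrase
      phrase = ch;
    }
  }
  if (phrase !== "") codes.push(dict.get(phrase));
  return codes;
}

function lzwDecompress(codes) {
  if (codes.length === 0) return new Uint8Array(0);
  const dict = [];
  for (let i = 0; i < 256; i++) dict.push([i]);
  let previous = dict[codes[0]];
  const out = [...previous];
  for (let i = 1; i < codes.length; i++) {
    const code = codes[i];
    let entry;
    if (code < dict.length) {
      entry = dict[code];
    } else if (code === dict.length) {
      // Special case: the code refers to the phrase being defined right now.
      entry = [...previous, previous[0]];
    } else {
      throw new Error("invalid LZW code: " + code);
    }
    out.push(...entry);
    dict.push([...previous, entry[0]]); // mirror the compressor's dictionary
    previous = entry;
  }
  return Uint8Array.from(out);
}
```

Short inputs may come out with nearly as many codes as bytes, since the dictionary needs repeated material before it pays off.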

David Murdoch answered Sep 20 '22 18:09


You will struggle to get much compression on a URL: URLs are too short and don't contain enough redundant information to benefit much from Huffman- or LZW-style algorithms.

If you have constraints on the space of possible URLs (e.g. all content tends to be in the same set of folders), you could hard-code some parts of the URLs and expand them on the client (i.e. cheat).
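The same "cheat" can be applied to the content being encoded. Below is a rough sketch, assuming both the page that builds the link and the page that reads it ship an identical, hard-coded list of common fragments. The fragment list, the function names, and the choice of Unicode private-use characters as placeholders are all assumptions made for illustration.

```javascript
// Hypothetical shared table. Both sides must use the same list in the same order.
const SHARED_FRAGMENTS = [
  "<!DOCTYPE html>", "<html>", "</html>", "<head>", "</head>",
  "<body>", "</body>", "<script>", "</script>", "<style>", "</style>",
];

// Each fragment gets one placeholder character from the private-use area.
const TOKEN_BASE = 0xe000;
const tokenFor = (index) => String.fromCharCode(TOKEN_BASE + index);

function shrink(text) {
  // Refuse input that already contains placeholder characters,
  // otherwise expansion would corrupt it.
  if (/[\uE000-\uF8FF]/.test(text)) {
    throw new Error("input contains private-use characters");
  }
  let result = text;
  SHARED_FRAGMENTS.forEach((fragment, i) => {
    result = result.split(fragment).join(tokenFor(i));
  });
  return result;
}

function expand(text) {
  let result = text;
  SHARED_FRAGMENTS.forEach((fragment, i) => {
    result = result.split(tokenFor(i)).join(fragment);
  });
  return result;
}
```

Each placeholder takes 3 bytes in UTF-8, so only fragments longer than that save space. The table also cannot change once links are in circulation, or old links will expand incorrectly.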

James Gaunt answered Sep 20 '22 18:09