
javascript string compression with localStorage

I am using localStorage in a project, and it will need to store lots of data, mostly of type int, bool and string. I know that JavaScript strings are Unicode, but when stored in localStorage, do they stay Unicode? If so, is there a way I could compress the string to use all of the bits in each Unicode character, or should I just use base64 and get less compression? All of the data will be stored as one large string.

EDIT: Now that I think about it, base64 wouldn't do much compression at all. The data is already in roughly base 64: a-zA-Z0-9 plus space, ';' and ':' is 65 characters.

invisible bob asked Jul 28 '11 20:07


2 Answers

You could encode to Base64 and then implement a simple lossless compression algorithm, such as run-length encoding or Golomb encoding. This shouldn't be too hard to do and might give you a bit of compression.
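As a rough illustration of the run-length idea, here is a minimal sketch. The function names, the (character, count) pair format and the `MAX_RUN` cap are assumptions made for this example, not part of any library. Note that this naive format doubles the size of data with no repeated characters, so it only pays off when long runs are common.

```javascript
// Hypothetical cap: keeps the count character below the surrogate range (0xD800-0xDFFF).
const MAX_RUN = 0xd7ff;

// Encode each run as two UTF-16 code units: the character, then its run length.
function rleEncode(str) {
  let out = "";
  let i = 0;
  while (i < str.length) {
    const ch = str[i];
    let run = 1;
    while (i + run < str.length && str[i + run] === ch && run < MAX_RUN) {
      run++;
    }
    out += ch + String.fromCharCode(run);
    i += run;
  }
  return out;
}

// Reverse the encoding: read (character, count) pairs and expand them.
function rleDecode(encoded) {
  let out = "";
  for (let i = 0; i < encoded.length; i += 2) {
    out += encoded[i].repeat(encoded.charCodeAt(i + 1));
  }
  return out;
}
```

For example, `rleDecode(rleEncode("aaaabbbcd"))` gives back the original string, and a run of four identical characters is stored in two code units.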

I also found JsZip (http://jszip.stuartk.co.uk/). I guess you could check the code and only use the algorithm, if it is compatible.

Hope this helps.

Laurent Zuijdwijk answered Oct 02 '22 14:10


"when stored in localStorage, do they stay unicode?"

The Web Storage working draft defines local storage values as DOMString. DOMStrings are defined as sequences of 16-bit units using the UTF-16 encoding. So yes, they stay Unicode.
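A small sketch of what "16-bit units" means in practice: `length` counts UTF-16 code units, so a rough size estimate is two bytes per unit. The helper name `approxUtf16Bytes` is made up for this example, and how a browser actually stores or counts the data against its quota is implementation-dependent.

```javascript
// Rough estimate only: two bytes per UTF-16 code unit.
// Actual storage and quota accounting vary between browsers.
function approxUtf16Bytes(str) {
  return str.length * 2;
}

const ascii = "abc";        // 3 code units
const astral = "\u{1F600}"; // one code point outside the BMP, stored as a surrogate pair (2 code units)

console.log(approxUtf16Bytes(ascii));
console.log(approxUtf16Bytes(astral));
```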

"is there a way I could compress the string to use all of the data in a unicode byte...?"

"Base32k" encoding should give you 15 bits per character. A base32k-type encoding takes advantage of the 16 bits in each UTF-16 code unit, but gives up one bit to avoid tripping on double-word characters (surrogate pairs). If your original data is base64 encoded, it only uses 6 bits per character. Re-encoding those 6 bits into base32k should compress it to 6/15 = 40% of its original size. See http://lists.xml.org/archives/xml-dev/200307/msg00505.html and http://lists.xml.org/archives/xml-dev/200307/msg00507.html.
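To make the 15-bits-per-character idea concrete, here is a minimal sketch of a base32k-style packer. It is not the exact scheme from the linked posts: the `OFFSET` of 0x4000 (mapping values into 0x4000-0xBFFF, below the surrogate range), the leading header character that records the padding, and the names `encode15`/`decode15` are all assumptions made for this illustration.

```javascript
// Hypothetical mapping: 15-bit values go to code units 0x4000..0xBFFF,
// which stays clear of the surrogate range 0xD800..0xDFFF.
const OFFSET = 0x4000;

// Pack bytes into 15-bit characters. The first character stores how many
// padding bits were added to fill the last character (0..14).
function encode15(bytes) {
  let body = "";
  let acc = 0;   // bit accumulator
  let nbits = 0; // number of valid bits currently in acc
  for (const b of bytes) {
    acc = (acc << 8) | b;
    nbits += 8;
    if (nbits >= 15) {
      nbits -= 15;
      body += String.fromCharCode(OFFSET + ((acc >> nbits) & 0x7fff));
      acc &= (1 << nbits) - 1;
    }
  }
  let pad = 0;
  if (nbits > 0) {
    pad = 15 - nbits;
    body += String.fromCharCode(OFFSET + ((acc << pad) & 0x7fff));
  }
  return String.fromCharCode(OFFSET + pad) + body;
}

// Unpack a string produced by encode15 back into a Uint8Array.
function decode15(str) {
  if (str.length === 0) return new Uint8Array(0);
  const pad = str.charCodeAt(0) - OFFSET;
  const totalBits = (str.length - 1) * 15 - pad;
  const bytes = new Uint8Array(totalBits / 8);
  let acc = 0;
  let nbits = 0;
  let pos = 0;
  for (let i = 1; i < str.length; i++) {
    acc = (acc << 15) | (str.charCodeAt(i) - OFFSET);
    nbits += 15;
    while (nbits >= 8 && pos < bytes.length) {
      nbits -= 8;
      bytes[pos++] = (acc >> nbits) & 0xff;
      acc &= (1 << nbits) - 1;
    }
  }
  return bytes;
}
```

With this layout, 15 bytes (120 bits) fit in 8 data characters plus the one header character. The resulting string can be passed to `localStorage.setItem` like any other string.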

For even further reduction in size, you can decode your base64 strings into their full 8-bit binary form, compress that with a known compression algorithm (e.g. a JavaScript implementation of gzip), and then base32k encode the compressed output.
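The first step of that pipeline, turning a base64 string back into raw bytes, can be sketched as follows. This assumes the environment provides a global `atob` (browsers do, as do recent Node.js versions); the helper name `base64ToBytes` is made up for this example. The compression step itself is left out here, since it depends on which library you choose.

```javascript
// Decode a base64 string into its underlying 8-bit bytes.
// atob returns a "binary string" in which each character code is one byte.
function base64ToBytes(b64) {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

The resulting `Uint8Array` is what you would hand to the compressor, and the compressed bytes are what you would then pack at 15 bits per character.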

Oren Trutner answered Oct 02 '22 14:10