Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Most efficient way to store large arrays of integers in localStorage with Javascript

*"Efficient" here basically means in terms of smaller size (to reduce the IO waiting time), and speedy retrieval/deserialization times. Storing times are not as important.

I have to store a couple of dozen arrays of integers, each with 1800 values in the range 0-50, in the browser's localStorage -- that is, as a string.

Obviously, the simplest method is to just JSON.stringify it, however, that adds a lot of unnecessary information, considering that the ranges of the data is well known. An average size for one of these arrays is then ~5500 bytes.

Here are some other methods I've tried (resultant size, and time to deserialize it 1000 times at the end)

  • zero-padding the numbers so each was 2 characters long, eg:

    [5, 27, 7, 38] ==> "05270738"
    
  • base 50 encoding it:

    [5, 11, 7, 38] ==> "5b7C"
    
  • just using the value as a character code (adding 32 to avoid the weird control characters at the start):

    [5, 11, 7, 38] ==> "%+'F" (String.fromCharCode(37), String.fromCharCode(43) ...)
    

Here are my results:

                  size     Chrome 18   Firefox 11
-------------------------------------------------
JSON.stringify    5286          60ms         99ms
zero-padded       3600         354ms        703ms
base 50           1800         315ms        400ms
charCodes         1800          21ms        178ms

My question is if there is an even better method I haven't yet considered?

Update
MДΓΓБДLL suggested using compression on the data. Combining this LZW implementation with the base 50 and charCode data. I also tested aroth's code (packing 4 integers into 3 bytes). I got these results:

                  size     Chrome 18   Firefox 11
-------------------------------------------------
LZW base 50       1103         494ms        999ms
LZW charCodes     1103         194ms        882ms
bitpacking        1350        2395ms        331ms
like image 325
nickf Avatar asked Apr 12 '12 00:04

nickf


1 Answers

If your range is 0-50, then you can pack 4 numbers into 3 bytes (6 bits per number). This would allow you to store 1800 numbers using ~1350 bytes. This code should do it:

window._firstChar = 48;

window.decodeArray = function(encodedText) {
    var result = [];
    var temp = [];

    for (var index = 0; index < encodedText.length; index += 3) {
        //skipping bounds checking because the encoded text is assumed to be valid
        var firstChar = encodedText.charAt(index).charCodeAt() - _firstChar;
        var secondChar = encodedText.charAt(index + 1).charCodeAt() - _firstChar;
        var thirdChar = encodedText.charAt(index + 2).charCodeAt() - _firstChar;

        temp.push((firstChar >> 2) & 0x3F);    //6 bits, 'a'
        temp.push(((firstChar & 0x03) << 4) | ((secondChar >> 4) & 0xF));  //2 bits + 4 bits, 'b'
        temp.push(((secondChar & 0x0F) << 2) | ((thirdChar >> 6) & 0x3));  //4 bits + 2 bits, 'c'
        temp.push(thirdChar & 0x3F);  //6 bits, 'd'

    }

    //filter out 'padding' numbers, if present; this is an extremely inefficient way to do it
    for (var index = 0; index < temp.length; index++) {
        if(temp[index] != 63) {
            result.push(temp[index]);
        }            
    }

    return result;
};

window.encodeArray = function(array) {
    var encodedData = [];

    for (var index = 0; index < dataSet.length; index += 4) {
        var num1 = dataSet[index];
        var num2 = index + 1 < dataSet.length ? dataSet[index + 1] : 63;
        var num3 = index + 2 < dataSet.length ? dataSet[index + 2] : 63;
        var num4 = index + 3 < dataSet.length ? dataSet[index + 3] : 63;

        encodeSet(num1, num2, num3, num4, encodedData);
    }

    return encodedData;
};

window.encodeSet = function(a, b, c, d, outArray) {
    //we can encode 4 numbers in 3 bytes
    var firstChar = ((a & 0x3F) << 2) | ((b >> 4) & 0x03);   //6 bits for 'a', 2 from 'b'
    var secondChar = ((b & 0x0F) << 4) | ((c >> 2) & 0x0F);  //remaining 4 bits from 'b', 4 from 'c'
    var thirdChar = ((c & 0x03) << 6) | (d & 0x3F);          //remaining 2 bits from 'c', 6 bits for 'd'

    //add _firstChar so that all values map to a printable character
    outArray.push(String.fromCharCode(firstChar + _firstChar));
    outArray.push(String.fromCharCode(secondChar + _firstChar));
    outArray.push(String.fromCharCode(thirdChar + _firstChar));
};

Here's a quick example: http://jsfiddle.net/NWyBx/1

Note that storage size can likely be further reduced by applying gzip compression to the resulting string.

Alternately, if the ordering of your numbers is not significant, then you can simply do a bucket-sort using 51 buckets (assuming 0-50 includes both 0 and 50 as valid numbers) and store the counts for each bucket instead of the numbers themselves. That would likely give you better compression and efficiency than any other approach.

like image 149
aroth Avatar answered Sep 19 '22 16:09

aroth