Summary: Is there a faster way to hash objects than JSON.stringify
?
Details: I have a Ruby and JavaScript library (NeatJSON) that provides pretty-printing of JavaScript values. I recently fixed a problem where deeply-nested objects caused O(n!) performance (n being the nesting level) using memoization based on the object being serialized and the indentation amount.
In Ruby, the fix was really easy, because you can index hashes by arrays of unique sets of objects:
build = ->(object,indent) do
memoizer[[object,indent]] ||= <all the rest of the code>
end
In JavaScript, however, I can't index an object by another object (in a unique way). Following the lead of several articles I found online, I decide to fix the problem generically, using JSON.stringify
on the full set of arguments to the function to create a unique key for memoization:
function memoize(f){
var memo = {};
var slice = Array.prototype.slice;
return function(){
var args = slice.call(arguments);
var mkey = JSON.stringify(args);
if (!(mkey in memo)) memo[mkey] = f.apply(this,args);
return memo[mkey];
}
}
function rawBuild(o,indent){ .. }
var build = memoize(rawBuild);
This works, but (a) it's a little slower than I'd like, and (b) it seems wildly inefficient (and inelegant) to perform (naive) serialization of every object and value that I'm about to serialize smartly. The act of serializing a large object with many values is going to store a string and formatting result for EVERY unique value (not just leaf values) in the entire object.
Is there a modern JavaScript trick that would let me uniquely identify a value? For example, some way of accessing an internal ID, or otherwise associating complex objects with unique integers that takes O(1) time to find the identifier for a value?
Memoization is actually a specific type of caching. The term caching can generally refer to any storing technique (like HTTP caching) for future use, but memoizing refers more specifically to caching function that returns the value.
Memoization: Memoization is a technique for speeding up applications by caching the results of expensive function calls and returning them when the same inputs are used again.
Memoization Using Memo 1 2 3 const MyComponent = React. memo(function MyComponent(props) { /* render using props */ }); We can make a custom comparison function and pass it as a second argument to memo to control its default comparison behavior, which shallowly compares complex objects in the props object.
In programming, memoization is an optimization technique that makes applications more efficient and hence faster. It does this by storing computation results in cache, and retrieving that same information from the cache the next time it's needed instead of computing it again.
Using @Bergi's suggestion of a WeakMap
I found out about Map
, which allows using any value type as the key (not just objects). Because I needed a compound key—uniquely memoizing the combination of the value passed in and the indentation string—I created a hierarchical memoization structure:
function memoizedBuild(){
var memo = new Map;
return function(value,indent){
var byIndent=memo.get(value);
if (!byIndent) memo.set(value,byIndent={});
if (!byIndent[indent]) byIndent[indent] = rawBuild(value,indent);
return byIndent[indent];
}
}
This proved to be about 4× faster than the memoization code I had been using when serializing a large 270kB JSON object.
Note that in the above code I'm able to use !byIndent[indent]
only because I know that rawBuild
will never return a falsey value (null
, undefined
, false
, NaN
, 0
, ""
). The safer code line would look something like:
if (!(indent in byIndent)) byIndent[indent] = rawBuild(value,indent);
If you are looking to memoise your objects by identity (not by content), then you'll want to use a WeakMap
which is designed for exactly this purpose. They don't work for primitive values though, so you'll need a different solution for such arguments.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With