So I know that JavaScript Maps have a limit on the number of keys they can store (around 16.7 million).
I was trying to test whether I could (in a very ugly way) remove the oldest elements from the Map. I noticed that, no matter what I did, the limiting factor was not the Map's size but rather the number of operations I had performed.
Below is some example code:
const map = new Map();
let i = 0;
while (true) {
  i++;
  set(i, i);
  if (i % 1000 === 0)
    console.log('INSERTED: ', i, 'KEYS', 'MAP SIZE :', map.size);
}
function set(key, value) {
  if (map.size > 16770000) {
    Array.from(map.keys()).slice(0, 10000).forEach(key => map.delete(key));
    console.log('DELETED, current map size:', map.size);
  }
  try {
    map.set(key, value);
  } catch (e) {
    console.log('MAP SIZE:', map.size, 'INSERTED:', key);
    throw e;
  }
}
When you run the snippet, check your console. At the end (when the exception is thrown) you will see the MAP SIZE and INSERTED values. MAP SIZE varies (depending on how many elements you remove, which in this case is 10000), but INSERTED is always the same value. So how come I am hitting a limit when I am not reaching the size limit of the Map? Is this some kind of reference issue that I am missing?
EDIT: As mentioned by @CRice, if you increase the number of items deleted to around 10,000,000, the loop continues seemingly forever.
EDIT 2: Here is an answer from one of the V8 devs talking about the limit of 16.7M keys: https://stackoverflow.com/a/54466812/5507414
EDIT 3: See answer: https://stackoverflow.com/a/63234302/5507414. We still need a V8 developer or someone with further knowledge in the engine to clarify this.
I adapted your script (see below) to see how many items had to be deleted before it could insert keys into the Map again.
The result is 8388608 (= 16777216/2) with node v12.18.1 (built on Chrome's V8 JavaScript engine).
It reminded me of a usual pattern where the underlying data structure doubles in size when it's almost full. So I looked for the actual Map implementation in the V8 engine.
Here's what V8 development blog says about it:
ECMAScript 2015 introduced several new data structures such as Map, Set, WeakSet, and WeakMap, all of which use hash tables under the hood.
And here's an interesting comment in V8 source code:
HashTable is a subclass of FixedArray that implements a hash table that uses open addressing and quadratic probing. In order for the quadratic probing to work, elements that have not yet been used and elements that have been deleted are distinguished. Probing continues when deleted elements are encountered and stops when unused elements are encountered.
- Elements with key == undefined have not been used yet.
- Elements with key == the_hole have been deleted.
Basically, when the script deletes a key, it seems that the entry is just marked as deleted. It becomes a "hole", as the V8 code comment puts it. The space is reclaimed only when the engine rebuilds the underlying data structure (which is what happens once the script has deleted half of the elements).
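To make the "hole" idea concrete, here is a toy model. It is not V8's implementation: the class name `ToyTable`, the fixed capacity, and the "rebuild only when at least half of the slots are holes" rule are all assumptions, chosen only because they reproduce the observation above (half of the entries had to be deleted before an insert worked again).

```javascript
// Toy model only. ToyTable, its fixed capacity and its rebuild rule are
// assumptions for illustration, not V8 internals.
const HOLE = Symbol('hole'); // marks a slot whose entry was deleted

class ToyTable {
  constructor(maxCapacity) {
    this.maxCapacity = maxCapacity; // the table can never grow past this
    this.slots = [];                // append-only until a rebuild happens
    this.live = 0;                  // what a user sees as "size"
  }

  get size() { return this.live; }
  get holes() { return this.slots.length - this.live; }

  set(key, value) {
    const slot = this.slots.find(s => s !== HOLE && s.key === key);
    if (slot) { slot.value = value; return; }
    if (this.slots.length === this.maxCapacity) {
      // Assumed rule: compact only if at least half of the slots are holes.
      if (this.holes * 2 < this.maxCapacity) {
        throw new RangeError('toy table: maximum size exceeded');
      }
      this.slots = this.slots.filter(s => s !== HOLE); // the rebuild
    }
    this.slots.push({ key, value });
    this.live++;
  }

  delete(key) {
    const idx = this.slots.findIndex(s => s !== HOLE && s.key === key);
    if (idx === -1) return false;
    this.slots[idx] = HOLE; // the slot stays occupied
    this.live--;
    return true;
  }
}

const table = new ToyTable(8);
for (let k = 1; k <= 8; k++) table.set(k, k); // table is now full
for (let k = 1; k <= 3; k++) table.delete(k); // size drops, holes remain

let insertFailed = false;
try { table.set(9, 9); } catch (e) { insertFailed = e instanceof RangeError; }
console.log('size:', table.size, 'slots used:', table.slots.length, 'insert failed:', insertFailed);

table.delete(4);  // now half of the slots are holes
table.set(9, 9);  // triggers the rebuild, then succeeds
console.log('size:', table.size, 'slots used:', table.slots.length);
```

In this model `size` drops as soon as a key is deleted, but the number of used slots does not, which is the same mismatch the question describes between MAP SIZE and INSERTED.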
Anyway, that's my understanding. We would need to delve into V8 code in order to clarify all the details.
The adapted script:
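const map = new Map();
let i = 0;
// Phase 1: insert keys until the engine refuses to store any more.
while (true) {
  i++;
  try {
    map.set(i, i);
  } catch (e) {
    console.log(e);
    break;
  }
  if (i % 100000 === 0)
    console.log('inserted: ', i);
}
// At this point i is the key whose insertion failed.
console.log('max map size:', map.size, 'inserted:', i);
// Phase 2: delete the oldest keys one at a time, retrying the failed
// insert after each deletion, until the insert succeeds.
let j = 0;
while (true) {
  j++;
  map.delete(j);
  if (j % 100000 === 0) {
    console.log('deleted: ', j, 'map size: ', map.size);
    if (map.size === 0) {
      break;
    }
  }
  try {
    map.set(i, i);
  } catch (e) {
    continue; // still failing, delete one more key
  }
  break; // the insert worked
}
console.log('deleted before inserting again: ', j);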
I dug into the ECMA language spec to take a look at Maps. It seems that the behavior you are seeing is consistent with the spec, and follows from the spec's definition of Map.prototype.delete.
When a Map element is deleted with Map.prototype.delete(key), the spec only requires that the element with the matching key be set to empty.
Here's the definition copied and pasted from the ECMA spec:
23.1.3.3 Map.prototype.delete ( key )
The following steps are taken:
1. Let M be the this value.
2. Perform ? RequireInternalSlot(M, [[MapData]]).
3. Let entries be the List that is M.[[MapData]].
4. For each Record { [[Key]], [[Value]] } p that is an element of entries, do
   a. If p.[[Key]] is not empty and SameValueZero(p.[[Key]], key) is true, then
      i. Set p.[[Key]] to empty.
      ii. Set p.[[Value]] to empty.
      iii. Return true.
5. Return false.
The most important piece for us here is step 4a.
When deleting an element, Map.prototype.delete checks each record p for an element where p.[[Key]] matches the provided key argument. When found, p.[[Key]] and p.[[Value]] are both set to empty.
This means that, while the key and value are gone and can no longer be retrieved, the element that held them may well remain in the Map's storage and still take up space behind the scenes.
While the specification contains the following note about its use of "empty"...
The value empty is used as a specification device to indicate that an entry has been deleted. Actual implementations may take other actions such as physically removing the entry from internal data structures.
...it's still leaving the door open for implementations to simply wipe the data without reclaiming the space, which is apparently what is occurring in your example here.
In the case of set(), the function checks first for an existing element with a matching key to mutate the value of, and skips over all empty elements in the process. If none is found, then "Append p [<key, value>] as the last element of entries".
In the case of size, the spec loops over all elements in the Map, and simply increments a counter for all non-empty elements it encounters.
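The three operations described above can be sketched as a literal, list-based model of the spec text. The names `SpecListMap`, `EMPTY` and `sameValueZero` are made up for this sketch, and it skips details such as the spec normalizing -0 to +0 in set(). Real engines do not store Maps this way; the point is only to show how `entries` can keep growing while `size` stays small.

```javascript
// A literal model of the spec steps, not how any engine stores a Map.
// SpecListMap, EMPTY and sameValueZero are names invented for this sketch.
const EMPTY = Symbol('empty'); // stands in for the spec's "empty" marker

// SameValueZero: like ===, except that NaN is considered equal to NaN.
function sameValueZero(a, b) {
  return a === b || (a !== a && b !== b);
}

class SpecListMap {
  constructor() {
    this.entries = []; // models the [[MapData]] list
  }

  set(key, value) {
    for (const p of this.entries) {
      if (p.key !== EMPTY && sameValueZero(p.key, key)) {
        p.value = value; // existing key: update in place
        return this;
      }
    }
    this.entries.push({ key, value }); // otherwise append as the last element
    return this;
  }

  delete(key) {
    for (const p of this.entries) {
      if (p.key !== EMPTY && sameValueZero(p.key, key)) {
        p.key = EMPTY;   // step 4a.i
        p.value = EMPTY; // step 4a.ii
        return true;     // step 4a.iii
      }
    }
    return false; // step 5
  }

  get size() {
    let count = 0;
    for (const p of this.entries) {
      if (p.key !== EMPTY) count++; // only non-empty records are counted
    }
    return count;
  }
}

const specMap = new SpecListMap();
for (let k = 1; k <= 5; k++) specMap.set(k, k);
for (let k = 1; k <= 3; k++) specMap.delete(k);
specMap.set(6, 6);
console.log('size:', specMap.size, 'records kept in the list:', specMap.entries.length);
```

Here the list holds six records even though only three of them are still visible through `size`.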
I found this really interesting. If I had to hazard a guess, I suppose that the overhead of finding and removing empty elements is seen as unnecessary in most cases, since the quantities that must be reached to fill the structure are so large, i.e. since Maps hold so much. I wonder how large the time and space overhead of removing an empty element would be for a dataset large enough for it to be needed.
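One rough way to get a feel for that overhead from user code is to copy the surviving entries into a brand-new Map and time it. The helper name `compactMap` is hypothetical, and whether a copy like this actually sidesteps the limit in a given engine is an assumption I have not verified against the engine source. The timing will also vary by machine and engine version.

```javascript
// Hypothetical helper, not an engine feature: copy the live entries of a
// Map into a brand-new Map.
function compactMap(oldMap) {
  // The Map constructor iterates the iterable it is given, and iterating a
  // Map only yields entries that still exist, in insertion order.
  return new Map(oldMap);
}

const big = new Map();
for (let k = 0; k < 1000000; k++) big.set(k, k);
for (let k = 0; k < 500000; k++) big.delete(k); // remove the oldest half

const start = Date.now();
const fresh = compactMap(big);
const elapsedMs = Date.now() - start;
console.log('copied', fresh.size, 'entries in', elapsedMs, 'ms');
```

The copy is a separate object, so any code holding a reference to the old Map would need to be pointed at the new one.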