I want to add objects into some table in IndexedDB in one transaction:
_that.bulkSet = function(data, key) {
    var transaction = _db.transaction([_tblName], "readwrite"),
        store = transaction.objectStore(_tblName),
        ii = 0;
    _bulkKWVals.push(data);
    _bulkKWKeys.push(key);
    if (_bulkKWVals.length == 3000) {
        insertNext();
    }
    function insertNext() {
        if (ii < _bulkKWVals.length) {
            store.add(_bulkKWVals[ii], _bulkKWKeys[ii]).onsuccess = insertNext;
            ++ii;
        } else {
            console.log(_bulkKWVals.length);
        }
    }
};
It looks like it works fine, but it is not a very optimized way of doing this, especially when the number of objects is very high (~50,000-500,000). How could I optimize it? Ideally I want to add the first 3000, remove them from the array, then add the next 3000, and so on, in chunks. Any ideas?
It is not possible to get good performance when inserting that many rows consecutively.
I'm an IndexedDB dev and have real-world experience with IndexedDB at the scale you're talking about (writing hundreds of thousands of rows consecutively). It ain't too pretty.
In my opinion, IDB is not suitable for use when a large amount of data has to be written consecutively. If I were to architect an IndexedDB app that needed lots of data, I would figure out a way to seed it slowly over time.
The issue is writes, and the problem as I see it is that the slowness of writes, combined with their I/O-intensive nature, makes things worse over time. (Reads are always lightning fast in IDB, for what it's worth.)
To start, you'll get savings from re-using transactions. Because of that, your first instinct might be to try to cram everything into the same transaction. But what I've found in Chrome, for example, is that the browser doesn't seem to like long-running writes, perhaps because of some mechanism meant to throttle misbehaving tabs.
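One way to act on this is to write in batches, with one transaction per batch, so that each transaction is re-used for many rows but none of them runs for very long. What follows is only a rough sketch, not a tuned implementation. The names chunk, writeBatch and insertInChunks are made up for illustration, db is assumed to be an already opened IDBDatabase, and the right batch size is something you would have to measure yourself.

```javascript
// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  if (!(size > 0)) {
    throw new RangeError("size must be a positive number");
  }
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Write one batch inside its own readwrite transaction.
// Resolves with the batch length once the transaction has completed.
function writeBatch(db, storeName, values, keys) {
  return new Promise(function (resolve, reject) {
    const tx = db.transaction([storeName], "readwrite");
    const store = tx.objectStore(storeName);
    tx.oncomplete = function () { resolve(values.length); };
    tx.onerror = function () { reject(tx.error); };
    tx.onabort = function () { reject(tx.error); };
    for (let i = 0; i < values.length; i++) {
      store.add(values[i], keys[i]);
    }
  });
}

// Insert all rows, one batch (and therefore one transaction) at a time.
async function insertInChunks(db, storeName, values, keys, batchSize) {
  const valueChunks = chunk(values, batchSize);
  const keyChunks = chunk(keys, batchSize);
  let written = 0;
  for (let i = 0; i < valueChunks.length; i++) {
    written += await writeBatch(db, storeName, valueChunks[i], keyChunks[i]);
  }
  return written;
}
```

With the names from the question, a call might look like insertInChunks(_db, _tblName, _bulkKWVals, _bulkKWKeys, 3000). Because each batch waits for the previous transaction to complete, the source arrays could also be trimmed between batches if memory is a concern.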
I'm not sure what kind of performance you're seeing, but average numbers might fool you depending on the size of your test. The limiting factor is throughput, but if you're trying to insert large amounts of data consecutively, pay attention to writes over time specifically.
I happen to be working on a demo with several hundred thousand rows at my disposal, and have stats. With my visualization disabled, running pure dash on IDB, here's what I see right now in Chrome 32 on a single object store with a single non-unique index with an auto-incrementing primary key.
With a much, much smaller 27k row dataset, I saw 60-70 entries/second:
* ~30 seconds: 921 entries/second on average (there's always a great burst of inserts at the start), 62/second at the moment I sampled
* ~60 seconds: 389/second on average (sustained decreases starting to outweigh the effect of the initial burst), 71/second at the moment
* ~1:30: 258/second, 67/second at moment
* ~2:00 (~1/3 done): 188/second on average, 66/second at moment
Some examples with a much smaller dataset show far better performance, but similar characteristics. Ditto much larger datasets: the effects are greatly exaggerated, and I've seen as little as less than 1 entry per second when leaving it running for multiple hours.
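If you want to collect the same two figures for your own inserts (the running average and the rate at the moment of sampling), a small tracker along these lines is enough. This is a sketch with made-up names (createRateTracker, sample), not what I used to produce the numbers above; you feed it a timestamp in milliseconds and the total number of rows written so far.

```javascript
// Rows per second over a time span; 0 when no time has passed.
function perSecond(count, elapsedMs) {
  return elapsedMs > 0 ? count / (elapsedMs / 1000) : 0;
}

// Returns a sample(nowMs, totalCount) function that reports both the
// average rate since startMs and the rate since the previous sample.
function createRateTracker(startMs) {
  let lastMs = startMs;
  let lastCount = 0;
  return function sample(nowMs, totalCount) {
    const result = {
      average: perSecond(totalCount, nowMs - startMs),
      current: perSecond(totalCount - lastCount, nowMs - lastMs)
    };
    lastMs = nowMs;
    lastCount = totalCount;
    return result;
  };
}
```

Calling sample(Date.now(), rowsWrittenSoFar) on a timer while the import runs makes the gap between the two numbers visible early, which is the thing an overall average hides.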
IndexedDB is actually designed to optimize for bulk operations. The problem is that the spec and certain docs do not advertise how it works. If you pay close attention to the parts of the IndexedDB specification that define how the mutating operations in IDBObjectStore work (add(), put(), delete()), you'll find that it allows callers to call them synchronously, one after another, and to skip listening to the success events of all but the last one. By doing that (while still listening to onerror), you will get enormous performance gains.
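In plain IndexedDB, without any library, the idea could be sketched roughly as follows. The function name bulkAdd and its callback signature are made up for illustration, and db is assumed to be an already opened IDBDatabase. Every add() is queued synchronously, every request gets the same error handler, and only the last request gets a success handler. Requests within a transaction are executed in the order they were made, so the last success implies the earlier ones have been processed.

```javascript
// Queue all add() calls at once and listen for success only on the last one.
// done(err, count) is called exactly once.
function bulkAdd(db, storeName, values, keys, done) {
  if (values.length === 0) {
    done(null, 0);
    return;
  }
  const tx = db.transaction([storeName], "readwrite");
  const store = tx.objectStore(storeName);
  let finished = false;

  function finish(err) {
    if (finished) return; // report only the first outcome
    finished = true;
    done(err, err ? 0 : values.length);
  }

  // One shared error handler for every request.
  function onError(event) {
    finish(event.target.error);
  }

  let request;
  for (let i = 0; i < values.length; i++) {
    request = store.add(values[i], keys[i]);
    request.onerror = onError;
  }
  // Only the last request gets a success handler.
  request.onsuccess = function () {
    finish(null);
  };
}
```

Note that the last success event fires before the transaction has committed. If you need to know that the data is actually committed, listen for the transaction's complete event instead of (or in addition to) the last request's success.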
This example using Dexie.js shows the possible bulk speed: it inserts 10,000 rows in 680 ms on my MacBook Pro (using Opera/Chromium).
This is accomplished by the Table.bulkPut() method in the Dexie.js library:
db.objects.bulkPut(arrayOfObjects)
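To check the timing on your own machine, a small wrapper like the one below can be used. It is only a sketch: timeBulkPut is a made-up helper name, and table is assumed to be a Dexie Table such as db.objects, whose bulkPut() returns a promise. The database name and schema in the usage comment are assumptions for illustration, not taken from the example above.

```javascript
// Time a single bulkPut() call on a Dexie table.
// Resolves with the number of rows passed in and the elapsed milliseconds.
async function timeBulkPut(table, arrayOfObjects) {
  const start = Date.now();
  await table.bulkPut(arrayOfObjects);
  const elapsedMs = Date.now() - start;
  return { rows: arrayOfObjects.length, elapsedMs: elapsedMs };
}

// Usage in a page that has loaded Dexie (names here are hypothetical):
//   const db = new Dexie("bulkDemo");
//   db.version(1).stores({ objects: "++id" });
//   timeBulkPut(db.objects, arrayOfObjects).then(function (stats) {
//     console.log(stats.rows + " rows in " + stats.elapsedMs + " ms");
//   });
```

The numbers will vary with the browser, the size of the objects and the indexes defined on the table, so treat any single measurement as a rough indication only.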