How can I efficiently buffer records emitted by a stream in Node.js so that I can bulk insert them into MongoDB, instead of doing one insert per record received from the stream? Here's the pseudo code I have in mind:
// Open MongoDB connection
mystream.on('data', (record) => {
  // bufferize data into an array
  // if the buffer is full (1000 records)
  //   bulk insert into MongoDB and empty buffer
})
mystream.on('end', () => {
  // close connection
})
Does this look realistic? Are there any possible optimizations? Are there existing libraries that facilitate this?
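For reference, the pseudo code above could be made concrete roughly as follows. This is only a sketch: `bulkInsertFromStream` and `insertBatch` are hypothetical names, and `insertBatch` stands in for whatever performs the bulk write. It uses `for await` instead of `'data'` handlers so that no further records are consumed while an insert is pending, and it flushes the leftover records when the stream ends, a case the pseudo code does not cover.

```javascript
// Hypothetical helper: reads an object-mode stream and calls
// insertBatch(docs) once per full batch, plus once for any remainder.
async function bulkInsertFromStream(source, insertBatch, batchSize = 1000) {
  let buffer = [];
  // Async iteration only pulls the next record after the loop body
  // (including any awaited insert) has finished.
  for await (const record of source) {
    buffer.push(record);
    if (buffer.length >= batchSize) {
      await insertBatch(buffer);
      buffer = []; // start a fresh array; the inserted one is left untouched
    }
  }
  // The stream ended: flush whatever did not fill a whole batch.
  if (buffer.length > 0) {
    await insertBatch(buffer);
  }
}

// Possible usage with the MongoDB driver (assumes `collection` and
// `mystream` already exist):
//   await bulkInsertFromStream(mystream,
//     (docs) => collection.insertMany(docs, { ordered: false }));
```

Passing the insert function in as a parameter keeps the batching logic independent of the database, which also makes it easy to try out with an in-memory array.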
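Using Node.js's built-in stream module, this can be implemented concisely and efficiently with a Writable that implements writev, which receives every chunk that was buffered while the previous write was still in progress: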
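const stream = require('stream');
const util = require('util');
const mongo = require('mongodb'); // the official driver package is named "mongodb"

// streamSource: an object-mode stream of documents from somewhere.
// Wrapped in an async function because CommonJS has no top-level await.
async function main(streamSource) {
  // Establish DB connection
  const client = new mongo.MongoClient("uri");
  await client.connect();
  try {
    // The specific collection to store our documents
    const collection = client.db("my_db").collection("my_collection");
    await util.promisify(stream.pipeline)(
      streamSource,
      new stream.Writable({
        objectMode: true,
        highWaterMark: 1000, // in object mode this counts objects, not bytes
        writev: async (chunks, next) => {
          try {
            // Each entry has the shape {chunk, encoding}; keep only the document
            const documents = chunks.map(({chunk}) => chunk);
            await collection.insertMany(documents, {ordered: false});
            next();
          }
          catch( error ){
            next( error );
          }
        }
      })
    );
  } finally {
    // Close the connection whether the pipeline succeeded or failed
    await client.close();
  }
}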
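To see how `writev` groups records without needing a database, here is a small self-contained sketch. `makeFakeCollection`, `batchingWriter`, and `demo` are hypothetical names, and the fake collection only records the size of each batch after a short artificial delay. Batch sizes are not fixed: they depend on how many chunks accumulated while the previous write was pending, so a chunk that arrives when the buffer is empty may be written on its own.

```javascript
const { Readable, Writable } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical stand-in for a MongoDB collection: remembers batch sizes.
function makeFakeCollection() {
  const batches = [];
  return {
    batches,
    async insertMany(docs) {
      // Simulated I/O latency, so chunks pile up in the Writable's buffer
      await new Promise((resolve) => setTimeout(resolve, 5));
      batches.push(docs.length);
    },
  };
}

// A Writable that forwards every buffered chunk in one insertMany call.
function batchingWriter(collection, maxBuffered = 1000) {
  return new Writable({
    objectMode: true,
    highWaterMark: maxBuffered, // counts objects in object mode
    writev(chunks, callback) {
      collection
        .insertMany(chunks.map(({ chunk }) => chunk))
        .then(() => callback(), callback); // pass any error to the stream
    },
  });
}

// Pipes 2500 small documents through the writer and returns batch sizes.
async function demo() {
  const collection = makeFakeCollection();
  const docs = Array.from({ length: 2500 }, (_, i) => ({ i }));
  await pipeline(Readable.from(docs), batchingWriter(collection, 1000));
  return collection.batches;
}
```

Calling `demo().then(console.log)` prints the list of batch sizes. If uniform batches of exactly 1000 matter, an explicit buffer that is flushed when full and again at the end of the stream is the more predictable choice.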