Say I have an initial array of objects:
var initialData = [
    { 'ID': 1, 'FirstName': 'Sally' },
    { 'ID': 2, 'FirstName': 'Jim' },
    { 'ID': 3, 'FirstName': 'Bob' }
];
I then get new data (another array of objects):
var newData = [
    { 'ID': 2, 'FirstName': 'Jim' },
    { 'ID': 4, 'FirstName': 'Tom' },
    { 'ID': 5, 'FirstName': 'George' }
];
I want to merge the new data into initial data. However, I don't want to overwrite any objects in the initial data array. I just want to add in objects that weren't already there.
I know the objects are duplicates based on their 'ID' key.
I know I can do this by looping through the new data, checking to see if it exists in the initial data, and if not, pushing into initial data.
for ( var i = 0, l = newData.length; i < l; i++ ) {
    // key_exists() is a function that uses .filter() to test.
    if ( ! key_exists( newData[i].ID, initialData ) ) {
        initialData.push( newData[i] );
    }
}
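For reference, key_exists() is just a small helper of my own along these lines (roughly; it filters on the ID value):

// Returns true if any object in `data` already has the given ID.
function key_exists( id, data ) {
    return data.filter( function ( item ) {
        return item.ID === id;
    } ).length > 0;
}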
I'm concerned about performance, though. I know there are lots of new ES6 ways of manipulating arrays, so I'm hoping someone has a better idea.
What is the best way (best as in best performance) of merging the new data into the initial data, while ignoring duplicates in new data?
You can create a Set of the IDs from initialData; this makes the "is this ID already in the initial data?" check O(1):
var initialData = [
    { 'ID': 1, 'FirstName': 'Sally' },
    { 'ID': 2, 'FirstName': 'Jim' },
    { 'ID': 3, 'FirstName': 'Bob' }
];
var newData = [
    { 'ID': 2, 'FirstName': 'Jim' },
    { 'ID': 4, 'FirstName': 'Tom' },
    { 'ID': 5, 'FirstName': 'George' }
];

var ids = new Set(initialData.map(d => d.ID));
var merged = [...initialData, ...newData.filter(d => !ids.has(d.ID))];

console.log(merged);
The final runtime of this approach is O(n + m), where n and m are the lengths of initialData and newData.
If you want to be slightly more efficient, you can loop through newData and push any new elements onto the result array manually (instead of using filter and the spread operator), which avoids building the intermediate filtered array.
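Something like this (a rough sketch of that version; starting from a copy in merged is just one way to do it, and you could push into initialData directly instead):

var ids = new Set(initialData.map(d => d.ID));
var merged = initialData.slice(); // start from a copy of the initial data

for (var i = 0; i < newData.length; i++) {
    if (!ids.has(newData[i].ID)) {
        merged.push(newData[i]);
        ids.add(newData[i].ID); // also guards against duplicate IDs within newData itself
    }
}
console.log(merged);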
Actually, if you are interested in performance, you could consider changing your initialData structure to something like this:
var initialData = { "1": {'FirstName': 'Sally'}, "2": {'FirstName': 'Jim'}, "3": {'FirstName': 'Bob'} };
In other words, we use the IDs as the keys of an object. This gives you O(1) access to the data and O(1) for the existence test. You can build this structure from your original array with reduce():
var initialData = [
    { 'ID': 1, 'FirstName': 'Sally' },
    { 'ID': 2, 'FirstName': 'Jim' },
    { 'ID': 3, 'FirstName': 'Bob' }
];

let newInitialData = initialData.reduce((res, {ID, FirstName}) => {
    res[ID] = { FirstName: FirstName };
    return res;
}, {});

console.log(newInitialData);
Using this new structure, you can write an O(n) algorithm to insert the new data that is not already there:
var initialData = { "1": {'FirstName': 'Sally'}, "2": {'FirstName': 'Jim'}, "3": {'FirstName': 'Bob'} }; var newData = [ {'ID': 2, 'FirstName': 'Jim'}, {'ID': 4, 'FirstName': 'Tom'}, {'ID': 5, 'FirstName': 'George'} ]; newData.forEach(({ID, FirstName}) => { initialData[ID] = initialData[ID] || {FirstName: FirstName}; }); console.log(initialData);