Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

An efficient Javascript set structure

After reading many similar questions:

  • JavaScript implementation of a set data structure
  • Mimicking sets in JavaScript?
  • Node JS, traditional data structures? (such as Set, etc), anything like Java.util for node?
  • Efficient Javascript Array Lookup
  • Best way to find if an item is in a JavaScript array?
  • How do I check if an array includes an object in JavaScript?

I still have a question: suppose I have a large array of strings (several thousands), and I have to make many lookups (i.e. check many times whether a given string is contained in this array). What is the most efficient way to do this in Node.js ?

A. Sort the array of strings, then use binary search? or:

B. Convert the strings to keys of an object, then use the "in" operator

?

I know that the complexity of A is O(log N), where N is the number of strings.

But I don't know the complexity of B.

IF a Javascript object is implemented as a hash table, then the complexity of B is, on average, O(1), which is better than A. However, I don't know if a Javascript object is really implemented as a hash table!

like image 870
Erel Segal-Halevi Avatar asked Jun 12 '13 20:06

Erel Segal-Halevi


People also ask

Is there a Set data structure in JavaScript?

Set is a data structure that was first introduced in ECMAScript 6. A Set holds a collection of values much like Arrays but it can only contain unique elements; Each value in a Sets occurs only once. Whether you're new to JavaScript or a seasoned programmer, the Set data structure can come in handy in many scenarios.

Which is faster array or Set?

In a direct comparison, Sets have several advantages over arrays, especially when it comes to a faster run-time: Search for an Item: Using indexOf() or includes() to check whether an item exists in an array is slow. Deleting an Item: In a Set, you can delete an item by its value.

What is {} and [] in JavaScript?

{} is shorthand for creating an empty object. You can consider this as the base for other object types. Object provides the last link in the prototype chain that can be used by all other objects, such as an Array . [] is shorthand for creating an empty array.


1 Answers

Update for 2016

Since you're asking about node.js and it is 2016, you can now use either the Set or Map object from ES6 as these are built into ES6. Both allows you to use any string as a key. The Set object is appropriate when you just want to see if the key exists as in:

if (mySet.has(someString)) {
    //code here
}

And, Map is appropriate when you want to store a value for that key as in:

if (myMap.has(someString)) {
    let val = myMap[someString];
    // do something with val here
}

Both ES6 features are now built into node.js as of node V4 (the current version of node.js as of this edit is v6).

See this performance comparison to see how much faster the Set operations are than many other choices.

Older Answer

All important performance questions should be tested with actual performance tests in a tool like jsperf.com. In your case, a javascript object uses a hash-table like implementation because without something that performs pretty well, the whole implementation would be slow since so much of javascript uses object.

String keys on an object would be the first thing I'd test and would be my guess for the best performer. Since the internals of an object are implemented in native code, I'd expect this to be faster than your own hashtable or binary search implemented in javascript.

But, as I started my answer with, you should really test your specific circumstance with the number and length of strings you are most concerned about in a tool like jsperf.

like image 162
jfriend00 Avatar answered Oct 02 '22 02:10

jfriend00