Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to search a big array for an object?

I had an interview today, I was asked how search for a number inside an array, I said binarysearch, he asked me how about a big array that has thousands of bjects (for example Stocks) searching for example by price of the stocks, I said binarysearch again, he said sorting an array of thousands will take lot of time before applying binarysearch.

Can you please bear with me and teach me how to approach this problem ? thanks your help is appreciated.

like image 939
Majid Kamal Avatar asked Mar 23 '12 00:03

Majid Kamal


People also ask

Which search is suitable for large array?

Linear search. This will take O(n) . It should be good enough if the array size is in the range of thousands and millions.

How do you find the object of an array of objects?

Answer: Use the find() Method You can simply use the find() method to find an object by a property value in an array of objects in JavaScript. The find() method returns the first element in the given array that satisfies the provided testing function. If no values satisfy the testing function, undefined is returned.

How do you search within an array?

Use filter if you want to find all items in an array that meet a specific condition. Use find if you want to check if that at least one item meets a specific condition. Use includes if you want to check if an array contains a particular value. Use indexOf if you want to find the index of a particular item in an array.

What is a big array?

An array of different things or people is a large number or wide range of them. [...]


2 Answers

I was asked a similar question.The twist was to search in sorted and then an unsorted array .These were my answers all unaccepted

  1. For sorted I suggested we can find the center and do a linear search .Binary search will also work here
  2. For unsorted I suggested linear again .
  3. Then I suggested Binary which is kind of wrong.
  4. Suggested storing the array in a hashset and utilize hashing . (Not accepted since high space complexcity)
  5. I suggested Tree Set which is a Red Black tree quite good for lookup.(Not accepted since high space complexcity)
  6. Copying into Arraylist etch were also considered overhead.

In the end I got a negative feedback. Though we may think that one of the above is solution but surely there is something special in linear searching which I am missing.

To be noted sorting before searching is also an overhead especially if you are utilizing any extra data structures in between.

Any comments welcomed.

like image 168
user666 Avatar answered Sep 30 '22 14:09

user666


I am not sure what he had in mind.

If you just want to find the number one time, and you have no guarantees about whether the array is sorted, then I don't think you can beat linear search. On average you will need to seek halfway through the array before you find the value, i.e. expected running time O(N); when sorting you have to touch every single value at least once and probably more than that, i.e. expected running time O(N log N).

But if you need to find multiple values then the time spent sorting it pays off quickly. With a sorted array, you can binary search in O(log N) time, so for sure by the third search you are ahead if you invested the time to sort.

You can do even better if you are allowed to build different data structures to help with the problem. You could build some sort of index, such as a hash table; but the champion data structure for this sort of problem probably would be some sort of tree structure. Then you can insert new values into the tree faster than you could append new values and re-sort the array, and the lookup is still going to be O(log N) to find any value. There are different sorts of trees available: binary tree, B-tree, trie, etc.

But as @Hot Licks said, a hash table is often used for this sort of thing, and it's pretty cheap to update: you just append a value on the main array, and update the hash table to point to the new value. And a hash table is very close to O(1) time, which you can't beat. (A hash table is O(1) if there are no hash collisions; assuming a good hash algorithm and a big enough hash table there will be almost no collisions. I think you could say that a hash table is O(N) where N is the average number of hash collisions per "bucket". If I'm wrong about that I expect to be corrected very quickly; this is StackOverflow!)

like image 22
steveha Avatar answered Sep 30 '22 12:09

steveha