 

How to filter a large NSArray efficiently?

I'm having a performance problem with filtering a large NSArray (19k items) for interactive auto-completion on the iPhone.

Currently, whenever the user types a letter into the search box, I start filtering the array with an NSPredicate on a separate thread and display the results. Of course, the data set is too big for the iPhone to complete the filtering before the user taps the second key, so no results are shown until the user stops typing for a second or two.
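The current approach looks roughly like this (`allItems` and `showResults:` are placeholders for the real data source and UI update):

    // Filter on a background queue so typing stays responsive; this is the
    // O(n) predicate pass described above.
    - (void)filterForQuery:(NSString *)query {
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            NSPredicate *predicate =
                [NSPredicate predicateWithFormat:@"SELF BEGINSWITH[cd] %@", query];
            // Evaluates the predicate against every one of the 19k items.
            NSArray *matches = [self.allItems filteredArrayUsingPredicate:predicate];
            dispatch_async(dispatch_get_main_queue(), ^{
                [self showResults:matches];
            });
        });
    }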

[Computer Science Babble, you can safely skip this part] I guess what the framework is doing is applying the NSPredicate to each and every item in the array, thus requiring O(n), where n is the number of array items. However, it should be possible to solve the problem in closer to O(log n) with a more efficient approach: sort the list once in O(n log n) (this could even be done at development time), look up where one would need to insert the search string in that list in O(log n), and iterate from there until an item no longer starts with the search string, which is O(m). That gives an efficient O(log n + m) algorithm, with m << n. A DAWG would be even better, but I can't remember seeing anything like that in the toolkit. [/Computer Science Babble]
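A sketch of that idea using Foundation's binary-search support, assuming the items are NSStrings that were sorted case-insensitively ahead of time:

    // Binary-search for the query's insertion point, then scan forward while
    // items still match the prefix: O(log n) + O(m).
    NSArray *PrefixMatches(NSArray *sortedItems, NSString *query) {
        NSUInteger start =
            [sortedItems indexOfObject:query
                         inSortedRange:NSMakeRange(0, sortedItems.count)
                               options:NSBinarySearchingInsertionIndex |
                                       NSBinarySearchingFirstEqual
                       usingComparator:^NSComparisonResult(NSString *a, NSString *b) {
                           return [a caseInsensitiveCompare:b];
                       }];

        NSMutableArray *matches = [NSMutableArray array];
        NSString *prefix = [query lowercaseString];
        for (NSUInteger i = start; i < sortedItems.count; i++) {
            if (![[sortedItems[i] lowercaseString] hasPrefix:prefix]) break;
            [matches addObject:sortedItems[i]];
        }
        return matches;
    }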

I was wondering if there is a built-in way to let the array know that it is sorted by the very same field the filter is testing against, so that the filter could be applied efficiently to that sorted array.

Solution

I've created a very simple search index using a dictionary that maps individual characters to arrays of items whose key starts with that character. At least for my use case this was enough optimization to achieve instantaneous display of auto-completion results.
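In essence (with `items` and `query` standing in for the real data and search text):

    // Build the index once: first character -> all items starting with it.
    NSMutableDictionary *index = [NSMutableDictionary dictionary];
    for (NSString *item in items) {
        if (item.length == 0) continue;
        NSString *key = [[item substringToIndex:1] lowercaseString];
        NSMutableArray *bucket = index[key];
        if (bucket == nil) {
            bucket = [NSMutableArray array];
            index[key] = bucket;
        }
        [bucket addObject:item];
    }

    // At query time, only the matching bucket is filtered instead of all 19k items.
    NSString *firstChar = [[query substringToIndex:1] lowercaseString];
    NSPredicate *p = [NSPredicate predicateWithFormat:@"SELF BEGINSWITH[cd] %@", query];
    NSArray *matches = [index[firstChar] filteredArrayUsingPredicate:p];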


Chris


1 Answer

If the data is sorted in some fashion, then I would suggest breaking the array into multiple smaller arrays. So you might have A-G, H-M, N-Z arrays.

Alternatively, stuff everything into a Core Data or SQLite database and use queries to speed things up. Indexed database selects will be much more efficient than filtering data in memory when you are dealing with such a large dataset.
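For example, with Core Data a fetch like this stays fast once the attribute is indexed (the `Item` entity, `name` attribute, and `context` are hypothetical names):

    // Let SQLite do the prefix search on an indexed attribute.
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Item"];
    request.predicate =
        [NSPredicate predicateWithFormat:@"name BEGINSWITH[cd] %@", query];
    request.sortDescriptors =
        @[[NSSortDescriptor sortDescriptorWithKey:@"name" ascending:YES]];
    request.fetchLimit = 50; // an autocomplete list rarely needs more than a screenful

    NSError *error = nil;
    NSArray *matches = [context executeFetchRequest:request error:&error];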

Another suggestion is to create a trie, which would make everything much better, though it is a bit of work to build:

http://en.wikipedia.org/wiki/Trie
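As a rough illustration of the idea (names like `TrieNode`, `TrieInsert`, and `TrieMatches` are made up for this sketch, not from any framework):

    @interface TrieNode : NSObject
    @property (nonatomic, strong) NSMutableDictionary *children; // NSString -> TrieNode
    @property (nonatomic, copy) NSString *word;                  // non-nil if a word ends here
    @end

    @implementation TrieNode
    - (instancetype)init {
        if ((self = [super init])) {
            _children = [NSMutableDictionary dictionary];
        }
        return self;
    }
    @end

    // Insert one word, one node per character.
    static void TrieInsert(TrieNode *root, NSString *word) {
        TrieNode *node = root;
        NSString *lower = [word lowercaseString];
        for (NSUInteger i = 0; i < lower.length; i++) {
            NSString *ch = [lower substringWithRange:NSMakeRange(i, 1)];
            TrieNode *next = node.children[ch];
            if (next == nil) {
                next = [TrieNode new];
                node.children[ch] = next;
            }
            node = next;
        }
        node.word = word;
    }

    // Depth-first collection of every word below a node.
    static void TrieCollect(TrieNode *node, NSMutableArray *results) {
        if (node.word) [results addObject:node.word];
        for (TrieNode *child in [node.children allValues]) {
            TrieCollect(child, results);
        }
    }

    // Walk down the prefix, then collect the subtree.
    static NSArray *TrieMatches(TrieNode *root, NSString *prefix) {
        TrieNode *node = root;
        NSString *lower = [prefix lowercaseString];
        for (NSUInteger i = 0; i < lower.length && node != nil; i++) {
            node = node.children[[lower substringWithRange:NSMakeRange(i, 1)]];
        }
        NSMutableArray *results = [NSMutableArray array];
        if (node) TrieCollect(node, results);
        return results;
    }

Lookups then cost only the length of the prefix to find the subtree, plus the time to collect the matches under it.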


logancautrell


