Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimizing array iteration with nested Where (LINQ) clauses

I am creating a (C#) tool that has a search functionality. The search is kind of similar to a "go to anywhere" search (like ReSharper has or VS2013).

The search context is a string array that contains all items up front:

private string[] context; // contains thousands of elements

Searching is incremental and occurs with every new input (character) the user provides.

I have implemented the search using the LINQ Where extension method:

// User searched for "c"
var input = "c";
var results = context.Where(s => s.Contains(input));

When the user searches for "ca", I attempted to use the previous results as the search context, however this causes (i think?) a nested Where iteration, and does not run very well. Think of something like this code:

// Cache these results.
var results = var results = context.Where(s => s.Contains(input));

// Next search uses the previous search results
var newResults = results.Where(s => s.Contains(input));

Is there any way to optimize this scenario?

Converting the IEnumerable into an array with every search causes high memory allocations and runs poorly.

like image 892
lysergic-acid Avatar asked Apr 10 '26 10:04

lysergic-acid


2 Answers

Presenting the user with thousands of search results is pretty useless. You should add a "top" (Take in linq) statement to your query before presenting the result to the user.

var results = context.Where(s => s.Contains(input)).Take(100);

And if you want to present the next 100 results to the user:

var results = context.Where(s => s.Contains(input)).Skip(100).Take(100);

Also just use the original array for all the searches, no nested Where as it has no benefits unless you materialize the query.

like image 190
Magnus Avatar answered Apr 12 '26 23:04

Magnus


I got a couple of useful points to add, too many for a comment.

First off, i agree with the other comments that you should start with .take(100), decrease the load time. Even better, add one result at the time:

var results = context.Where(s => s.Contains(input));
var resultEnumerator = result.GetEnumerator()

Loop over the resultEnumerator to display results one at the time, stop when the screen is full or a new search is initiated.

Second, throttle your input. If the user writes Hello, you do not want to shoot off 5 searches for H, He, Hel, Hell and Hello, you want to search for just Hello. When the user later add world, it could be worthwhile to take your old result and add Hello world to the where clause.

results = results.Where(s => s.Contains(input));
resultEnumerator = result.GetEnumerator()

And of course, cancel the current in progress result when the user adds new text.

Using Rx, the throttle part is easy, you would get something like this:

var result = context.AsEnumerable();
var oldStr = "";
var resultEnumerator = result.GetEnumerator();
Observable.FromEventPattern(h => txt.TextChanged += h, h => txt.TextChanged -= h)
         .Select(s => txt.Text)
         .DistinctUntilChanged().Throttle(TimeSpan.FromMilliseconds(300))
         .Subscribe(s =>
         {
             if (s.Contains(oldStr))
                 result = result.Where(t => t.Contains(s));
             else
                 result = context.Where(t => t.Contains(s));
             resultEnumerator = result.GetEnumerator();
             oldStr = s;
             // and probably start iterating resultEnumerator again,
             // but perhaps not on this thread.
         });
like image 42
Dorus Avatar answered Apr 13 '26 00:04

Dorus



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!