
Can my code benefit from using LINQ?

I have this code, which works fine, but is slow on large datasets.

I'd like to hear from the experts whether this code could benefit from LINQ or another approach, and if so, how.

    Dim array_of_strings As String()

    ' Now I add strings to my array; these come from external file(s).
    ' This does not take long.

    ' Throughout the execution of my program, I need to validate millions
    ' of other strings.

    Dim search_string As String
    Dim indx As Integer

    ' So we get millions of situations like this, where I need to find out
    ' where in the array this exact string occurs.

    search_string = "the_string_search_for"

    indx = array_of_strings.ToList().IndexOf(search_string)

Each of the strings in my array is unique; there are no duplicates.

This works, but as I said, it is too slow for larger datasets. I run this query millions of times, and it currently takes about 1 minute per million queries, which is too slow for my liking.

Yeahson asked Feb 24 '16 18:02



1 Answer

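There's no need to use LINQ. Your current line copies the whole array into a new list with ToList() and then scans it linearly with IndexOf, so every query is O(n). If you used an indexed data structure like a Dictionary, each lookup would be O(1) on average, at the cost of a slightly longer process of filling the structure. Since you fill it once and then do millions of searches, you're going to come out ahead.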

See the description of Dictionary at this site: https://msdn.microsoft.com/en-us/library/7y3x785f(v=vs.110).aspx

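Since (I think) you're talking about a collection that is its own key, you could save some memory by using SortedSet<T>: https://msdn.microsoft.com/en-us/library/dd412070(v=vs.110).aspx. Keep in mind that SortedSet<T> lookups are O(log n), and that a set only tells you whether the string is present, not where it sits in your array.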
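
As an illustration of the Dictionary approach, here is a minimal sketch, assuming you want the same result IndexOf gives you (the array position, or -1 when the string is absent). The module name and the BuildIndex and FindIndex helpers are hypothetical names invented for this example, and the sample array contents are made up; only array_of_strings, search_string and indx come from the question.

```vb
Imports System
Imports System.Collections.Generic

Module IndexLookupSketch

    ' Hypothetical helper: build the lookup once, mapping each string to its
    ' position in the array. Assumes the strings are unique, as the question states.
    Function BuildIndex(items As String()) As Dictionary(Of String, Integer)
        Dim lookup As New Dictionary(Of String, Integer)(items.Length, StringComparer.Ordinal)
        For i As Integer = 0 To items.Length - 1
            lookup(items(i)) = i
        Next
        Return lookup
    End Function

    ' Hypothetical helper: returns the array index of searchString,
    ' or -1 if it is not present (the same convention IndexOf uses).
    Function FindIndex(lookup As Dictionary(Of String, Integer), searchString As String) As Integer
        Dim indx As Integer
        If lookup.TryGetValue(searchString, indx) Then
            Return indx
        End If
        Return -1
    End Function

    Sub Main()
        ' Made-up sample data standing in for the strings loaded from file(s).
        Dim array_of_strings As String() = {"alpha", "beta", "the_string_search_for"}

        ' Pay the build cost once, before the millions of queries.
        Dim lookup As Dictionary(Of String, Integer) = BuildIndex(array_of_strings)

        Console.WriteLine(FindIndex(lookup, "the_string_search_for")) ' prints 2
        Console.WriteLine(FindIndex(lookup, "missing"))               ' prints -1
    End Sub

End Module
```

StringComparer.Ordinal is chosen here to match the exact, case-sensitive comparison that IndexOf performs on strings. If you add strings to the array later, the dictionary has to be updated alongside it.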

Michael Blackburn answered Oct 12 '22 21:10
