string[] words = System.IO.File.ReadAllLines("word.txt");
var query = from word in words
where word.Length > "abe".Length && word.StartsWith("abe")
select word;
foreach (var w in query.AsParallel())
{
Console.WriteLine(w);
}
Basically the word.txt contains 170000 English words. Is there a collection class in C# that is faster than array of string for the above query? There will be no insert or delete, just search if a string starts with "abe" or "abdi".
Each word in the file is unique.
EDIT 1
This search will be performed potentially millions of times in my application. Also I want to stick with LINQ for collection query because I might need to use aggregate function.
EDIT 2
The words from the file are sorted already, the file will not change
A HashSet, similar to a Dictionary, is a hash-based collection, so look ups are very fast with O(1). But unlike a dictionary, it doesn't store key/value pairs; it only stores values.
If you need fast add and removal of elements, use LinkedList (but it has a very poor seeking performance).
There are no standard collections in C outside of the array.
Collections in C# are classified into two types - Generic collections and non-generic collections.
myself I'd create a Dictionary<char, List<string>>
, where I'd group words by their first letter. This will reduce substantially the lookup of needed word.
If you need to do search once there is nothing better than linear search - array is perfectly fine for it.
If you need to perform repeated searches you can consider soring the array (n Log n) and search by any prefix will be fast (long n). Depending on type of search using dictionary of string lists indexed by prefix may be another good option.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With