
Fastest way to create a list of unique strings from within a loop?

I have a set of strings (~80 000) that I can only access sequentially through the hits.Doc(int).Get("fieldName") method. At the moment I collect the unique ones like this:

List<string> idStrings = new List<string>();
int count = hits.Length();
for (int i = 0; i < count; i++)
{
    string idString = hits.Doc(i).Get("id");
    if (!idStrings.Contains(idString))
        idStrings.Add(idString);
}

The strings will later on have to be int.TryParse()'d. I think there should be a faster way to do this. Any suggestions?

Boris Callens asked Mar 25 '09 10:03


2 Answers

First of all, use a HashSet<string> instead of a list; its Contains method is much faster:

int count = hits.Length();
HashSet<string> idStrings = new HashSet<string>();

EDIT: You don't have to call Contains at all if you use a HashSet, since it can't contain duplicate items. Just call Add: if the value is already present, Add leaves the set unchanged and returns false.
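As a rough sketch of that behaviour, the snippet below deduplicates a small array in one pass. The array is only a hypothetical stand-in for the values returned by hits.Doc(i).Get("id"), and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class UniqueIdsDemo
{
    // Collects the distinct strings in a single pass.
    // HashSet<T>.Add returns false and leaves the set unchanged
    // when the value is already present, so no Contains check is needed.
    public static HashSet<string> CollectUnique(IEnumerable<string> ids)
    {
        var unique = new HashSet<string>();
        foreach (string id in ids)
        {
            unique.Add(id);
        }
        return unique;
    }

    public static void Main()
    {
        // Hypothetical sample data standing in for hits.Doc(i).Get("id").
        string[] sample = { "17", "42", "17", "8", "42" };

        HashSet<string> unique = CollectUnique(sample);
        Console.WriteLine(unique.Count);     // prints 3
        Console.WriteLine(unique.Add("17")); // prints False (already present)
    }
}
```

HashSet<T> requires .NET Framework 3.5 or later.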

ybo answered Oct 29 '22 13:10


Use a Dictionary instead of a List. Dictionary.ContainsKey is a hash lookup that takes roughly constant time on average, while List.Contains scans the whole list, so with ~80 000 strings the difference is large.

Dictionary<string, int> idStrings = new Dictionary<string, int>();
int count = hits.Length();
for (int i = 0; i < count; i++) {
   string idString = hits.Doc(i).Get("id");
   if (!idStrings.ContainsKey(idString)) {
      idStrings.Add(idString, 1);
   }
}

If you use .NET Framework 3.5 or later, you can use a HashSet instead of a Dictionary:

HashSet<string> idStrings = new HashSet<string>();
int count = hits.Length();
for (int i = 0; i < count; i++) {
   string idString = hits.Doc(i).Get("id");
   idStrings.Add(idString);
}
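
Since the question says the strings are passed to int.TryParse afterwards, one possible variation is to parse while collecting and keep a HashSet<int> instead. This is only a sketch under that assumption: the input array is a hypothetical stand-in for hits.Doc(i).Get("id"), and the class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class ParsedIdsDemo
{
    // Parses each id and keeps only the distinct integer values.
    // Strings that int.TryParse rejects (including null) are skipped.
    public static HashSet<int> CollectUniqueInts(IEnumerable<string> ids)
    {
        var unique = new HashSet<int>();
        foreach (string id in ids)
        {
            int value;
            if (int.TryParse(id, out value))
            {
                unique.Add(value); // duplicates are ignored by the set
            }
        }
        return unique;
    }

    public static void Main()
    {
        // Hypothetical sample data standing in for hits.Doc(i).Get("id").
        string[] sample = { "17", "42", "17", "abc", "8" };

        HashSet<int> unique = CollectUniqueInts(sample);
        Console.WriteLine(unique.Count); // prints 3
    }
}
```

Note that deduplicating after parsing treats strings such as "7" and "007" as the same id, which may or may not be what you want.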

Guffa answered Oct 29 '22 12:10