
How to iterate over a HashSet and remove elements in the most efficient way

Here is what I came up with, but I wonder whether it is the most efficient way. I need to do this because of RAM usage concerns.

HashSet<string> hsLinks = new HashSet<string>();
List<string> lstSortList = new List<string>();

// fill hashset with millions of records

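while (true)
{
    // Grab the first item the enumerator yields, then stop enumerating.
    string srLastitem = "";
    foreach (var item in hsLinks)
    {
        srLastitem = item;
        break;
    }
    // Move that single item from the set to the list.
    lstSortList.Add(srLastitem);
    hsLinks.Remove(srLastitem);
    if (hsLinks.Count == 0)
        break;
}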

This is a C# WPF application targeting .NET 4.5.2.

asked Mar 17 '23 17:03 by MonsterMMORPG


1 Answer

It seems you're trying to move items from the HashSet to the List. If that's the case, simply move everything at once with List.AddRange, then use HashSet.Clear to empty the HashSet:

lstSortList.AddRange(hsLinks);
hsLinks.Clear();
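
For reference, the one-shot move can be sketched as a complete program. The sample data, the MoveAllDemo class name, and the pre-sized list are assumptions added for illustration and are not part of the original answer. The TrimExcess call is included because, according to the HashSet documentation, Clear leaves the set's capacity unchanged until TrimExcess is called:

```csharp
using System;
using System.Collections.Generic;

static class MoveAllDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // Small stand-in for the "millions of records" in the question.
        var hsLinks = new HashSet<string> { "a.html", "b.html", "c.html" };

        // Assumption: pre-sizing the list avoids repeated internal array growth.
        var lstSortList = new List<string>(hsLinks.Count);

        lstSortList.AddRange(hsLinks); // copies references only, not the strings
        hsLinks.Clear();               // drops the set's references to the strings
        hsLinks.TrimExcess();          // releases the set's now-unused internal capacity

        Console.WriteLine(lstSortList.Count); // 3
        Console.WriteLine(hsLinks.Count);     // 0
    }
}
```

Whether TrimExcess is worth calling depends on whether the set will be refilled afterwards; if it will, keeping the capacity may be cheaper.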

If (as Vajura suggested) you're worried about holding on to two copies of the references*, you can move the items in batches rather than one at a time:

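// Requires: using System.Linq; (for Take, ToArray and Any)
const int batchSize = 1000;
var batch = new string[batchSize];
do
{
    // Copy up to batchSize items out of the set into the buffer.
    var batchIndex = 0;
    foreach (var link in hsLinks.Take(batchSize))
    {
        batch[batchIndex] = link;
        batchIndex++;
    }

    // The last batch may be partial; trim the buffer so stale items are not reused.
    if (batchIndex < batchSize)
    {
        batch = batch.Take(batchIndex).ToArray();
    }

    hsLinks.ExceptWith(batch);   // remove the batch from the set
    lstSortList.AddRange(batch); // append the batch to the list
} while (hsLinks.Any());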

Use a batch size appropriate for your memory concerns.


*Note: A reference is 4 bytes in a 32-bit process and 8 bytes in a 64-bit process. When you add the strings (which are reference types in .NET) to the list, you are not copying the strings themselves, only the references, whose size is mostly negligible.
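
As a rough worked example of the note above, the extra memory taken by the list's copy of the references can be estimated from IntPtr.Size, which is 4 in a 32-bit process and 8 in a 64-bit one. The item count of ten million and the class name are assumed for illustration, and the estimate ignores the list's spare capacity and any object overhead:

```csharp
using System;

static class ReferenceSizeEstimate // hypothetical name, for illustration only
{
    static void Main()
    {
        const long itemCount = 10000000; // assumed number of links

        // One reference per item; the strings themselves are not duplicated.
        long bytes = itemCount * IntPtr.Size;

        // About 76 MB when IntPtr.Size is 8, about 38 MB when it is 4.
        Console.WriteLine("{0:F1} MB", bytes / (1024.0 * 1024.0));
    }
}
```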

answered Mar 19 '23 05:03 by i3arnon