I got stuck with the following problem.
For example, I have some collection of items
List<int> exampleList = new List<int> { 1, 3, 5, 6, 7, 8, 6, 5, 6, 6 };
And some other collection of items that is subgroup of first one
List<int> customSelection = new List<int> { 1, 5, 6, 6, 8 };
What I want is to obtain difference between them, e.g. to get a collection containing items{ 3, 7, 5, 6, 6 }or in other words someIEnumerable<int> resultingCollectionthat will makecustomSelection.Concat(resultingCollection)to be equivalent toexampleList(not looking at items order).
I can't use.Except()extension method because it will exclude all items from first collection that are present in second one and that is not what I'm looking for. The only solution I came with is to do the following
// count item occurances in first collection
var countedItemsInFisrt = exampleList.GroupBy(item => item)
.ToDictionary(group => group.Key, group => group.Count());
// count item occurances in second collection
var countedItemsInSecond = customSelection.GroupBy(item => item)
.ToDictionary(group => group.Key, group => group.Count());
List<int> resultingCollection = new List<int>();
int itemsCountDifference;
int itemsCountInSecond;
foreach (var kvp in countedItemsInFisrt)
{
// when item count in first collection is grater then in second one we add it to resulting collection
// "count difference" times
if (!countedItemsInSecond.TryGetValue(kvp.Key, out itemsCountInSecond))
itemsCountInSecond = 0;
itemsCountDifference = kvp.Value - itemsCountInSecond;
for (int i = 0; i < itemsCountDifference; i++)
resultingCollection.Add(kvp.Key);
}
var stringResult = resultingCollection.Select(items => items.ToString());
Console.WriteLine(stringResult.Aggregate((a, b) => a + "," + b));
And this is just huge bunch of code to perform a selection. And even more I'm worried about performance becase in real case both collections can have thouthands of items.
Can this be done in some better way? Maybe I am missing something about LINQ that can help in my case?
EDIT:
The best solution for now is the last algorithm suggested by Ulugbek Umirov. It preserves order in the original collection and it also is significantly faster by 2.5 times than any other algorithm suggested when we have selection of 1/2 of original collection and even faster when selection is lesser. Great thanks to Ulugbek Umirov! I've made it into generic extension method that works with any generic collection:
public static IEnumerable<T> Subtract<T>(this IEnumerable<T> minuend, IEnumerable<T> subtrahend)
{
var diffList = new List<T>(minuend.Count() - subtrahend.Count());
var diffDict = subtrahend.GroupBy(n => n)
.ToDictionary(g => g.Key, g => g.Count());
minuend.ForeEach(n =>
{
int count = 0;
if (diffDict.TryGetValue(n, out count))
{
if (count == 1)
diffDict.Remove(n);
else
diffDict[n] = count - 1;
}
else
diffList.Add(n);
});
return diffList;
}
I wouldn't group second list.
List<int> exampleList = new List<int> { 1, 3, 5, 6, 7, 8, 6, 5, 6, 6 };
List<int> customSelection = new List<int> { 1, 5, 6, 6, 8 };
var diffDic = exampleList.GroupBy(n => n)
.ToDictionary(g => g.Key, g => g.Count());
customSelection.ForEach(n =>
{
if (diffDic.ContainsKey(n))
diffDic[n]--;
});
var diffList = diffDic.Where(p => p.Value > 0)
.SelectMany(p => Enumerable.Repeat(p.Key, p.Value))
.ToList();
Also the following piece of code may improve the performance:
customSelection.ForEach(n =>
{
int count = 0;
if (diffDic.TryGetValue(n, out count))
{
if (count == 1)
diffDic.Remove(n);
else
diffDic[n] = count - 1;
}
});
UPDATE
If you want to preserve the original order of items, you can use the following code:
List<int> exampleList = new List<int> { 1, 3, 5, 6, 7, 8, 6, 5, 6, 6 };
List<int> customSelection = new List<int> { 1, 5, 6, 6, 8 };
var diffList = new List<int>(exampleList.Count);
var customSelectionDic = customSelection.GroupBy(n => n)
.ToDictionary(g => g.Key, g => g.Count());
exampleList.ForEach(n =>
{
int count = 0;
if (customSelectionDic.TryGetValue(n, out count))
{
if (count == 1)
customSelectionDic.Remove(n);
else
customSelectionDic[n] = count - 1;
}
else
diffList.Add(n);
});
// diffList: { 3, 7, 5, 6, 6 }
This will not be the fastest, and will change the original list, but I think this is the shortest way:
customSelection.ForEach(x => exampleList.Remove(x));
Now exampleList will contain 3,7,5,6,6
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With