I am writing a small program to compare two List. If the values are the same, I add them to list dups, if they are different, I add them to distinct. I noticed that some of my values are added and some are not, and after debugging for awhile, I am not certain what the problem is. Can someone shed a little light? Thanks.
List<int> groupA = new List<int>();
List<int> groupB = new List<int>();
List<int> dups = new List<int>();
List<int> distinct = new List<int>();
groupA.Add(2);
groupA.Add(24);
groupA.Add(5);
groupA.Add(72);
groupA.Add(276);
groupA.Add(42);
groupA.Add(92);
groupA.Add(95);
groupA.Add(266);
groupA.Add(42);
groupA.Add(92);
groupB.Add(5);
groupB.Add(42);
groupB.Add(95);
groupA.Sort();
groupB.Sort();
for (int a = 0; a < groupA.Count; a++)
{
for (int b = 0; b < groupB.Count; b++)
{
groupA[a].CompareTo(groupB[b]);
if (groupA[a] == groupB[b])
{
dups.Add(groupA[a]);
groupA.Remove(groupA[a]);
groupB.Remove(groupB[b]);
}
}
distinct.Add(groupA[a]);
}
You need to find the missing elements in both:
List<int> onlyInA = groupA.Except(groupB).ToList();
List<int> onlyInB = groupB.Except(groupA).ToList();
Or in a single linq:
List<int> missing = groupA.Except(groupB).Union(groupB.Except(groupA)).ToList()
Note - as with all linq, its worth pointing out that its not the most efficient way to do this. All list iterations have a cost. A longer-hand way of sorting both lists then iterating over them together would be quicker if the lists were REALLY large...
I would use the Intersect
and Except
methods:
dups = groupA.Intersect(groupB).ToList();
distinct = groupA.Except(groupB).ToList();
When you remove an item from a list, you move the index of the remaining element down.
In essence, you are skipping some items using a for loop.
Try using a while loop, and manually increment the counter when you are not deleting an item.
For example, the following code is incorrect
List<int> nums = new List<int>{2, 4, 6, 7, 8, 10, 11};
for (int i = 0; i < nums.Count; i++)
{
if (nums[i] % 2 == 0)
nums.Remove(nums[i]);
}
If will return the list {4, 7, 10, 11}
instead of just {7, 11}
.
It will not remove the value of 4, because, when I remove the value of 2, (for i=0
) the nums
list goes from
//index 0 1 2 3 4 5 6
nums = {2, 4, 6, 7, 8, 10, 11}
to
//index 0 1 2 3 4 5
nums = {4, 6, 7, 8, 10, 11}
The loop finishes, the i is incremented to 1, and the next item referenced is nums[1]
, which is not 4 as one would intuitively expect, but 6. So in effect the value of 4 is skipped, and the check is not executed.
You should be very, very careful each time when you are modifying the collection you are iterating. For example, the foreach
statement will throw an exception if you even try this. In this case you could use a while like
List<int> nums = new List<int>{2, 4, 6, 7, 8, 10, 11};
int i = 0;
while (i < nums.Count)
{
if (nums[i] % 2 == 0)
{
nums.Remove(nums[i])
}
else
{
i++; //only increment if you are not removing an item
//otherwise re-run the loop for the same value of i
}
}
of you could even fork the for, like
for (int i = 0; i < nums.Count; i++)
{
if (nums[i] % 2 == 0)
{
nums.Remove(nums[i]);
i--; //decrement the counter, so that it will stay in place
//when it is incremented at the end of the loop
}
}
Alternatively you could use linq, like this:
distinct.AddRange(groupA);
distinct.AddRange(groupB);
distinct = distinct.Distinct().ToList();
and
dups.AddRange(groupA);
dups.AddRange(groupB);
dups = dups.GroupBy(i => i)
.Where(g => g.Count() > 1)
.Select(g => g.Key)
.ToList();
Note that the LINQ code will not alter your existing groupA and groupB lists. If you just want to distinct them, you could just do
groupA = groupA.Distinct().ToList();
groupB = groupB.Distinct().ToList();
You can easily do it with Linq:
List<int> dups = groupA.Intersect(groupB).ToList();
List<int> distinct = groupA.Except(groupB).ToList();
(assuming I correctly understood what you were trying to do)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With