Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compare two List<int>

Tags:

c#

I am writing a small program to compare two List. If the values are the same, I add them to list dups, if they are different, I add them to distinct. I noticed that some of my values are added and some are not, and after debugging for awhile, I am not certain what the problem is. Can someone shed a little light? Thanks.

        List<int> groupA = new List<int>();
        List<int> groupB = new List<int>();

        List<int> dups = new List<int>();
        List<int> distinct = new List<int>();

        groupA.Add(2);
        groupA.Add(24);
        groupA.Add(5);
        groupA.Add(72);
        groupA.Add(276);
        groupA.Add(42);
        groupA.Add(92);
        groupA.Add(95);
        groupA.Add(266);
        groupA.Add(42);
        groupA.Add(92);


        groupB.Add(5);
        groupB.Add(42);
        groupB.Add(95);

        groupA.Sort();
        groupB.Sort();

        for (int a = 0; a < groupA.Count; a++)
        {
            for (int b = 0; b < groupB.Count; b++)
            {
                groupA[a].CompareTo(groupB[b]);


                if (groupA[a] == groupB[b])
                {
                    dups.Add(groupA[a]);
                    groupA.Remove(groupA[a]);
                    groupB.Remove(groupB[b]);
                }

            }
            distinct.Add(groupA[a]);
        }
like image 798
jpavlov Avatar asked Jan 08 '13 14:01

jpavlov


4 Answers

You need to find the missing elements in both:

List<int> onlyInA = groupA.Except(groupB).ToList();
List<int> onlyInB = groupB.Except(groupA).ToList();

Or in a single linq:

List<int> missing = groupA.Except(groupB).Union(groupB.Except(groupA)).ToList()

Note - as with all linq, its worth pointing out that its not the most efficient way to do this. All list iterations have a cost. A longer-hand way of sorting both lists then iterating over them together would be quicker if the lists were REALLY large...

like image 53
James Joyce Avatar answered Sep 30 '22 06:09

James Joyce


I would use the Intersect and Except methods:

dups = groupA.Intersect(groupB).ToList();
distinct = groupA.Except(groupB).ToList();
like image 25
Massimiliano Peluso Avatar answered Nov 15 '22 08:11

Massimiliano Peluso


When you remove an item from a list, you move the index of the remaining element down. In essence, you are skipping some items using a for loop.
Try using a while loop, and manually increment the counter when you are not deleting an item.

For example, the following code is incorrect

List<int> nums = new List<int>{2, 4, 6, 7, 8, 10, 11};

for (int i = 0; i < nums.Count; i++)
{
  if (nums[i] % 2 == 0)
    nums.Remove(nums[i]);
}

If will return the list {4, 7, 10, 11} instead of just {7, 11}.

It will not remove the value of 4, because, when I remove the value of 2, (for i=0) the nums list goes from

//index 0  1  2  3  4   5   6 
nums = {2, 4, 6, 7, 8, 10, 11}

to

//index 0  1  2  3  4   5
nums = {4, 6, 7, 8, 10, 11}

The loop finishes, the i is incremented to 1, and the next item referenced is nums[1], which is not 4 as one would intuitively expect, but 6. So in effect the value of 4 is skipped, and the check is not executed.

You should be very, very careful each time when you are modifying the collection you are iterating. For example, the foreach statement will throw an exception if you even try this. In this case you could use a while like

List<int> nums = new List<int>{2, 4, 6, 7, 8, 10, 11};

int i = 0;
while (i < nums.Count)
{
  if (nums[i] % 2 == 0)
  {
    nums.Remove(nums[i])
  }      
  else
  {
    i++; //only increment if you are not removing an item
         //otherwise re-run the loop for the same value of i
  }  
}

of you could even fork the for, like

for (int i = 0; i < nums.Count; i++)
{
  if (nums[i] % 2 == 0)
  {
    nums.Remove(nums[i]);
    i--; //decrement the counter, so that it will stay in place
         //when it is incremented at the end of the loop
  }
}

Alternatively you could use linq, like this:

distinct.AddRange(groupA);
distinct.AddRange(groupB);
distinct = distinct.Distinct().ToList();

and

dups.AddRange(groupA);
dups.AddRange(groupB);

dups = dups.GroupBy(i => i)
           .Where(g => g.Count() > 1)
           .Select(g => g.Key)
           .ToList();

Note that the LINQ code will not alter your existing groupA and groupB lists. If you just want to distinct them, you could just do

groupA = groupA.Distinct().ToList();
groupB = groupB.Distinct().ToList();
like image 8
SWeko Avatar answered Nov 15 '22 09:11

SWeko


You can easily do it with Linq:

    List<int> dups = groupA.Intersect(groupB).ToList();
    List<int> distinct = groupA.Except(groupB).ToList();

(assuming I correctly understood what you were trying to do)

like image 5
Thomas Levesque Avatar answered Nov 15 '22 08:11

Thomas Levesque