
Linq's Except Method Doesn't Work for Byte Values

Tags: c#, .net, linq

    [TestMethod()]
    public void TestExceptWithRandomInput()
    {
        byte[] listA =  new byte[4096];
        var rand = new Random();
        rand.NextBytes(listA);

        byte[] listB = new byte[] { 0x00 };
        var nullCount = (from a in listA
                         where a == 0x00
                         select a);

        var listC = listA.Except(listB);

        Assert.AreEqual(4096, listA.Length);
        Assert.AreEqual(4096 - nullCount.Count(), listC.Count()); //Fails!!
    }

    [TestMethod()]
    public void TestWhereWithRandomInput()
    {
        byte[] listA = new byte[4096];
        var rand = new Random();
        rand.NextBytes(listA);

        byte[] listB = new byte[] { 0x00 };
        var nullCount = (from a in listA
                         where a == 0x00
                         select a);

        var listC = listA.Where(a => !listB.Contains(a));

        Assert.AreEqual(4096, listA.Length);
        Assert.AreEqual(4096 - nullCount.Count(), listC.Count()); //Successful
    }

The first test fails when Except() is used, but the second passes when Where() is used. What am I missing? Do I need to implement IEqualityComparer<byte>? I thought that was only necessary for complex types.

Monish Nagisetty asked Nov 30 '25 09:11


1 Answer

Except also gets rid of duplicates in the first argument. It's a set operation, and sets aren't meant to have duplicates - the same as Union, Intersect etc.

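listA.Except(listB) gives all the unique, non-null bytes in listA. Since a byte has only 256 possible values, the result can never contain more than 255 items, however long listA is, so its count cannot match what the first test expects.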
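A small sketch of that behaviour, using a hand-picked array instead of random input so the result is predictable. The class and method names (ExceptDemo, UniqueNonZero) are made up for illustration only:

```csharp
using System.Linq;

public static class ExceptDemo
{
    // Hypothetical helper: runs Except on a fixed input that contains repeats.
    public static byte[] UniqueNonZero()
    {
        byte[] listA = { 0x01, 0x01, 0x02, 0x00, 0x02, 0x00 };
        byte[] listB = { 0x00 };

        // Except drops 0x00 and also collapses the repeated 0x01 and 0x02,
        // so only two of the six input bytes come back.
        return listA.Except(listB).ToArray();
    }
}
```

Here the input holds four non-zero bytes, but the result holds only two, which is the same mismatch the failing test runs into.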

If you want to get all non-null bytes in a sequence, listA.Where(b => b != 0x00) is probably the logical thing to do.

If you want to count the null bytes, listA.Count(b => b == 0x00) expresses this most clearly.
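Both suggestions can be sketched together as follows. The wrapper class and method names (ZeroByteDemo, NonZeroBytes, ZeroCount) are assumptions made for the example, not part of the original test:

```csharp
using System.Linq;

public static class ZeroByteDemo
{
    // Keeps every non-zero byte, including repeats, in the original order.
    public static byte[] NonZeroBytes(byte[] input) =>
        input.Where(b => b != 0x00).ToArray();

    // Counts the zero bytes directly with the predicate overload of Count.
    public static int ZeroCount(byte[] input) =>
        input.Count(b => b == 0x00);
}
```

For any input, NonZeroBytes(input).Length + ZeroCount(input) equals input.Length, which is the relationship the original assertion was checking.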

If you want an "Except but preserving duplicates" without calling !Contains on the second sequence for every item (a linear scan each time, which gets inefficient as that sequence grows), you can do something like:

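    using System.Collections.Generic;

    // Extension methods must live in a static class; this class name is arbitrary.
    public static class EnumerableExtensions
    {
        public static IEnumerable<T> ExceptWithDuplicates<T>(
            this IEnumerable<T> source1,
            IEnumerable<T> source2)
        {
            // Build the lookup set once so each membership test is O(1) on average.
            HashSet<T> in2 = new HashSet<T>(source2);
            foreach (T s1 in source1)
            {
                // Contains rather than Add, so repeats in source1 are not filtered out.
                if (!in2.Contains(s1))
                {
                    yield return s1;
                }
            }
        }
    }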

(Disclaimer: not written in an IDE.) This is essentially the same as the regular Except, but it doesn't add the source items to the internal HashSet, so it will return the same item more than once.
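A usage sketch comparing the two on the same input. The extension method is repeated inside a hypothetical static class (EnumerableExtensions) so the snippet compiles on its own, and the demo class name is likewise an assumption:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    public static IEnumerable<T> ExceptWithDuplicates<T>(
        this IEnumerable<T> source1, IEnumerable<T> source2)
    {
        var in2 = new HashSet<T>(source2);
        foreach (T s1 in source1)
        {
            if (!in2.Contains(s1))
            {
                yield return s1;
            }
        }
    }
}

public static class ExceptWithDuplicatesDemo
{
    // Returns the item counts produced by each approach on an input with repeats.
    public static (int withDuplicates, int distinct) Counts()
    {
        byte[] listA = { 0x01, 0x01, 0x02, 0x00, 0x02, 0x00 };
        byte[] listB = { 0x00 };

        int kept = listA.ExceptWithDuplicates(listB).Count(); // repeats survive
        int unique = listA.Except(listB).Count();             // repeats collapsed
        return (kept, unique);
    }
}
```

With this input, the first count is 4 and the second is 2: both drop the zero bytes, but only Except also removes the repeated values.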

Rawling answered Dec 02 '25 22:12


