Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

VB.NET linq group by with anonymous types not working as expected

I was toying around with some of the linq samples that come with LINQPad. In the "C# 3.0 in a Nutshell" folder, under Chater 9 - Grouping, there is a sample query called "Grouping by Multiple Keys". It contains the following query:

from n in new[] { "Tom", "Dick", "Harry", "Mary", "Jay" }.AsQueryable()
group n by new
{
    FirstLetter = n[0],
    Length = n.Length
}

I added the string "Jon" to the end of the array to get an actual grouping, and came up with the following result:

C# LINQPad result

This was exactly what I was expecting. Then, in LINQPad, I went to the VB.NET version of the same query:

' Manually added "Jon"
from n in new string() { "Tom", "Dick", "Harry", "Mary", "Jay", "Jon" }.AsQueryable() _
group by ng = new with _
{ _
    .FirstLetter = n(0), _
    .Length = n.Length _
} into group

The result does not properly group Jay/Jon together.

VB.NET LINQPad result

After pulling my hair out for a bit, I discovered this MSDN article discussing VB.NET anonymous types. In VB.NET they are mutable by default as opposed to C# where they are immutable. In VB, you need to add the Key keyword to make them immutable. So, I changed the query to this (notice the addition of Key):

from n in new string() { "Tom", "Dick", "Harry", "Mary", "Jay", "Jon" }.AsQueryable() _
group by ng = new with _
{ _
    Key .FirstLetter = n(0), _
    Key .Length = n.Length _
} into group

This gave me the correct result:

enter image description here

So my question is this:

  1. Why does mutability/immutability of anonymous types matter when linq does an equality comparison? Notably, in Linq-to-SQL it doesn't matter at all, which is likely just a product of the translation to SQL. But in Linq-to-objects it apparently makes all the difference.
  2. Why would MS have chosen to make VB's anonymous types mutable. I see no real advantage, and after mucking around with this issue I see some very real disadvantages. Namely that your linq queries can have subtle bugs.

-- EDIT --

Just an interesting extra piece of info... Apparently this is keyed property issue is widely known. I just didn't know what to Google for. It's been discussed here and here on stackoverflow. Here's another example of the issue using anonymous types and Distinct:

Dim items = New String() {"a", "b", "b", "c", "c", "c"}
Dim result = items.Select(Function(x) New With {.MyValue = x}).Distinct()
Dim result2 = items.Select(Function(x) New With {Key .MyValue = x}).Distinct()
'Debug.Assert(result.Count() = 3) ' Nope... it's 6!
Debug.Assert(result2.Count() = 3)
like image 547
mattmc3 Avatar asked Jul 14 '11 19:07

mattmc3


1 Answers

The Key modifier doesn't just affect mutability - it also affects the behaviour of Equals and GetHashCode. Only Key properties are included in those calculations... which clearly affects grouping etc.

As for why it's different for VB - I don't know. It seems odd to me too. I know I'm glad that C# works the way it does though :) Even if it could be argued that making properties optionally mutable makes sense, I don't see why it should be the default.

like image 169
Jon Skeet Avatar answered Oct 13 '22 19:10

Jon Skeet