Consider the following simple code with LINQ OrderBy and ThenBy:
    static void Main()
    {
        var arr1 = new[] { "Alpha", "Bravo", "Charlie", };
        var coStr = Comparer<string>.Create((x, y) =>
        {
            Console.WriteLine($"Strings: {x} versus {y}");
            return string.CompareOrdinal(x, y);
        });
        arr1.OrderBy(x => x, coStr).ToList();

        Console.WriteLine("--");

        var arr2 = new[]
        {
            new { P = "Alpha", Q = 7, },
            new { P = "Bravo", Q = 9, },
            new { P = "Charlie", Q = 13, },
        };
        var coInt = Comparer<int>.Create((x, y) =>
        {
            Console.WriteLine($"Ints: {x} versus {y}");
            return x.CompareTo(y);
        });
        arr2.OrderBy(x => x.P, coStr).ThenBy(x => x.Q, coInt).ToList();
    }
This simply uses some comparers that write out to the console what they compare.
On my hardware and version of the Framework (.NET 4.6.2), this is the output:
    Strings: Bravo versus Alpha
    Strings: Bravo versus Bravo
    Strings: Bravo versus Charlie
    Strings: Bravo versus Bravo
    --
    Strings: Bravo versus Alpha
    Strings: Bravo versus Bravo
    Ints: 9 versus 9
    Strings: Bravo versus Charlie
    Strings: Bravo versus Bravo
    Ints: 9 versus 9
My question is: Why would they compare an item from the query to itself?
In the first case, before the -- separator, they do four comparisons. Two of them compare an entry to itself ("Strings: Bravo versus Bravo"). Why?
In the second case, there should never be a need to resort to comparing the Q properties (integers), because there are no duplicates (with respect to ordinal comparison) in the P values, so no tie-breaking from ThenBy should ever be needed. Still we see "Ints: 9 versus 9" twice. Why use the ThenBy comparer with identical arguments?
Note: Any comparer has to return 0 upon comparing something to itself. So unless the algorithm just wants to check if we implemented a comparer correctly (which it will never be able to do fully anyway), what is going on?
Be aware: There are no duplicates in the elements yielded by the queries in my examples.
I saw the same issue with another example with more entries yielded from the query. Above I just give a small example. This happens with an even number of elements yielded, as well.
In the reference source of the QuickSort method used by OrderBy you can see these two lines:

    while (i < map.Length && CompareKeys(x, map[i]) > 0) i++;
    while (j >= 0 && CompareKeys(x, map[j]) < 0) j--;

These while loops keep running as long as the element x points to (the pivot) compares "greater" (resp. "less") than the current element. A loop only stops after a comparison that fails this test, so when a scan reaches the pivot's own position, it stops by comparing the element with itself and getting 0.
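
To make this concrete, here is a minimal sketch that runs only those two scan loops once over the three strings from the question and records each comparison. It is not the framework's code: the class and method names are hypothetical, the comparison is a plain ordinal string compare, and picking the middle element as the pivot is an assumption that happens to match the output shown in the question.

```csharp
using System;
using System.Collections.Generic;

static class PivotScanDemo
{
    // Hypothetical helper: runs the two scan loops once over the whole
    // (non-empty) range and records every key comparison it makes.
    public static List<string> TraceScans(string[] keys)
    {
        var log = new List<string>();
        var map = new int[keys.Length];
        for (int k = 0; k < map.Length; k++) map[k] = k;

        // Logging stand-in for CompareKeys; a and b are indexes into keys.
        Func<int, int, int> compareKeys = (a, b) =>
        {
            log.Add($"{keys[a]} versus {keys[b]}");
            return string.CompareOrdinal(keys[a], keys[b]);
        };

        int i = 0, j = map.Length - 1;
        int x = map[i + ((j - i) >> 1)]; // assumed pivot choice: the middle element

        // Each scan ends with the first comparison that is not strictly
        // greater (resp. less); at the pivot's own slot that is x versus x.
        while (i < map.Length && compareKeys(x, map[i]) > 0) i++;
        while (j >= 0 && compareKeys(x, map[j]) < 0) j--;

        return log;
    }

    static void Main()
    {
        foreach (var line in TraceScans(new[] { "Alpha", "Bravo", "Charlie" }))
            Console.WriteLine(line);
    }
}
```

For the input above, the log contains four entries, and the second and fourth are "Bravo versus Bravo", the same pattern as in the question. The question's second output also suggests that a tie on the first key is passed on to the ThenBy comparer, which would explain "Ints: 9 versus 9" appearing right after each self-comparison.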
I can't prove it mathematically, but I guess that avoiding the comparison of identical elements would make the algorithm more complicated and introduce overhead that would hurt performance more than this single extra comparison does.
(Note that your comparer should be implemented cleverly enough to return 0 quickly for identical elements.)
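
One way to get that quick 0 for reference-type keys is a small wrapper that checks reference equality before delegating to the real comparer. This is only a sketch: ShortCircuitComparer and its InnerCalls counter are hypothetical names, not part of LINQ, and the shortcut only helps when the sort passes the very same object as both arguments.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper, not a framework type.
sealed class ShortCircuitComparer<T> : IComparer<T> where T : class
{
    private readonly IComparer<T> inner;

    // Counts how often the (possibly expensive) inner comparer really ran.
    public int InnerCalls { get; private set; }

    public ShortCircuitComparer(IComparer<T> inner)
    {
        this.inner = inner;
    }

    public int Compare(T x, T y)
    {
        // The same object on both sides is always equal; skip the real work.
        if (ReferenceEquals(x, y)) return 0;
        InnerCalls++;
        return inner.Compare(x, y);
    }
}

static class ShortCircuitDemo
{
    static void Main()
    {
        var comparer = new ShortCircuitComparer<string>(StringComparer.Ordinal);
        var sorted = new[] { "Charlie", "Alpha", "Bravo" }
            .OrderBy(s => s, comparer)
            .ToList();

        Console.WriteLine(string.Join(", ", sorted));
        Console.WriteLine($"Inner comparisons: {comparer.InnerCalls}");
    }
}
```

The wrapper does not change the resulting order; it only keeps self-comparisons away from the inner comparer, which matters mainly when that comparer is expensive.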