
Why is the Linq-to-Objects sum of a sequence of nullables itself nullable?

As usual, int? means System.Nullable<int> (or System.Nullable`1[System.Int32]).

Suppose you have an in-memory IEnumerable<int?> (a List<int?>, for example); call it seq. Then you can find its sum with:

var seqSum = seq.Sum();

This binds to the extension method overload int? IEnumerable<int?>.Sum() (documentation), which is really a static method on System.Linq.Enumerable.

However, the method never returns null, so why is the return type declared as Nullable<>? Even in cases where seq is an empty collection or more generally a collection all of whose elements are the null value of type int?, the Sum method in question still returns zero, not null.
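The behavior described above can be checked with a short sketch. The three sample lists and their names are hypothetical, chosen only for illustration; the program uses C# top-level statements and assumes a compiler that supports them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample sequences; the names are illustrative only.
var empty = new List<int?>();
var allNull = new List<int?> { null, null };
var mixed = new List<int?> { 1, null, 2 };

// Each call binds to Enumerable.Sum(IEnumerable<int?>), which returns int?.
int? emptySum = empty.Sum();
int? allNullSum = allNull.Sum();
int? mixedSum = mixed.Sum();

Console.WriteLine(emptySum.HasValue);    // True
Console.WriteLine(emptySum);             // 0
Console.WriteLine(allNullSum.HasValue);  // True
Console.WriteLine(allNullSum);           // 0
Console.WriteLine(mixedSum);             // 3
```

In all three cases HasValue is true, so the nullable wrapper carries no extra information for this overload.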

This is evident from the documentation, but also from the System.Core.dll source code:

public static int? Sum(this IEnumerable<int?> source) { 
    if (source == null) throw Error.ArgumentNull("source"); 
    int sum = 0; 
    checked { 
        foreach (int? v in source) { 
            if (v != null) sum += v.GetValueOrDefault(); 
        } 
    } 
    return sum; 
} 

Note that there is only one return statement, and its expression sum has type int (which is then implicitly converted to int? by wrapping).

It seems wasteful to always wrap the return value. (The caller could always do the wrapping implicitly on their side if desired.)

Besides, this return type may lead the caller into writing code such as if (!seqSum.HasValue) { /* logic to handle this */ } which will in reality be unreachable (a fact the C# compiler cannot know).
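A caller who wants a plain int can unwrap the result once instead of branching on HasValue. The following is a minimal sketch under the assumption that seq is a hypothetical in-memory sequence; the variable names plainSum and projectedSum are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input; any in-memory IEnumerable<int?> is handled the same way.
IEnumerable<int?> seq = new int?[] { 3, null, 4 };

int? seqSum = seq.Sum();

// GetValueOrDefault() yields the underlying int (0 if there were no value).
int plainSum = seqSum.GetValueOrDefault();

// Alternatively, map each element to a non-nullable int first, so that the
// selector overload returning int is chosen instead of the int? one.
int projectedSum = seq.Sum(v => v ?? 0);

Console.WriteLine(plainSum);      // 7
Console.WriteLine(projectedSum);  // 7
```

Both forms give an int without a HasValue check; which one reads better is a matter of taste.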

So why is the return type not simply declared as int, with no nullable?

I wonder if there is any benefit of having the same return type as int? IQueryable<int?>.Sum() (in System.Linq.Queryable class). This latter method may return null in practice if there are LINQ providers (maybe LINQ to SQL?) that implement it so.

Jeppe Stig Nielsen asked Dec 08 '16 13:12



1 Answer

Several comments have mentioned that this isn't really answerable (or is only opinion-based without an official response). I won't argue that. However, one can still analyze the available code and form a reasonably strong theory. Mine is simply that this follows an existing MS pattern.

If you look through the rest of System.Linq.Enumerable, in particular the math-related functions, you start to see a tendency to return the same type as the element type of the input sequence, unless there is a specific reason for the return type to differ.

See the following functions:

Max():

public static int Max(this IEnumerable<int> source);
public static int? Max(this IEnumerable<int?> source);
public static long Max(this IEnumerable<long> source);
public static long? Max(this IEnumerable<long?> source);

Min():

public static int Min(this IEnumerable<int> source);
public static int? Min(this IEnumerable<int?> source);
public static long Min(this IEnumerable<long> source);
public static long? Min(this IEnumerable<long?> source);

Sum():

public static int Sum(this IEnumerable<int> source);
public static int? Sum(this IEnumerable<int?> source);
public static long Sum(this IEnumerable<long> source);
public static long? Sum(this IEnumerable<long?> source);

For the exception to the rule, take a look at Average...

public static double Average(this IEnumerable<int> source);
public static double? Average(this IEnumerable<int?> source);

You can see that it still retains the Nullable<T> wrapper; however, the underlying type has to change to one that can represent the result of averaging integers.

When you look further into Average though, you see the following:

public static float Average(this IEnumerable<float> source);
public static float? Average(this IEnumerable<float?> source);

Again, this is back to the default pattern of returning the same type as the incoming element type.
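As a side observation on the signatures listed above, the nullable overloads of Max, Min and Average are documented to return null for an empty (or all-null) sequence, so for them the nullable return type does carry information, while Sum has the same shape but returns zero. A minimal sketch, assuming a hypothetical empty list named none:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical empty sequence of nullable ints.
var none = new List<int?>();

// All four calls bind to the nullable overloads shown in the listings.
int? max = none.Max();
int? min = none.Min();
double? avg = none.Average();
int? sum = none.Sum();

Console.WriteLine(max.HasValue);  // False
Console.WriteLine(min.HasValue);  // False
Console.WriteLine(avg.HasValue);  // False
Console.WriteLine(sum.HasValue);  // True
```

This does not prove the design intent; it only shows that Sum shares its signature shape with sibling methods for which null is a meaningful result.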

Now that we see this pattern here, let's see whether it shows up anywhere else. Let's take a look at System.Math, since we are on that subject.

Again, here we see the same pattern of using the same return type:

public static int Abs(int value);
public static long Abs(long value);

public static int Max(int val1, int val2);
public static long Max(long val1, long val2);

I'll mention it again: this amounts to an "opinion answer". I have looked for any MS best practices or language specification material that might confirm this as a deliberate MS pattern and back up my analysis, but I could not find anything. That being said, if you look at various places in the .NET core libraries, especially the System.Collections.Generic namespace, you will see that unless there is a specific reason, the return type matches the collection's element type.

I see no reason to deviate from that rule when it comes to Nullable<T> types.

gmiley answered Oct 09 '22 07:10