Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Linq running total 1st value added to itself

Tags:

c#

linq

I have the below which calculates the running total for a customer account status, however he first value is always added to itself and I'm not sure why - though I suspect I've missed something obvious:

    decimal? runningTotal = 0;
    IEnumerable<StatementModel> statement = sage.Repository<FDSSLTransactionHistory>()
        .Queryable()
        .Where(x => x.CustomerAccountNumber == sageAccount)
        .OrderBy(x=>x.UniqueReferenceNumber)
        .AsEnumerable()
        .Select(x => new StatementModel()
        {
            SLAccountId = x.CustomerAccountNumber,
            TransactionReference = x.TransactionReference,
            SecondReference = x.SecondReference,
            Currency = x.CurrencyCode,
            Value = x.GoodsValueInAccountCurrency,
            TransactionDate = x.TransactionDate,
            TransactionType = x.TransactionType,
            TransactionDescription = x.TransactionTypeName,
            Status = x.Status,
            RunningTotal = (runningTotal += x.GoodsValueInAccountCurrency)
        });

Which outputs:

29/02/2012 00:00:00 154.80  309.60  
30/04/2012 00:00:00 242.40  552.00  
30/04/2012 00:00:00 242.40  794.40  
30/04/2012 00:00:00 117.60  912.00  

Where the 309.60 of the first row should be simply 154.80

What have I done wrong?

EDIT: As per ahruss's comment below, I was calling Any() on the result in my View, causing the first to be evaluated twice - to resolve I appended ToList() to my query.

Thanks all for your suggestions

like image 580
SWa Avatar asked Jul 30 '14 12:07

SWa


2 Answers

Add a ToList() to the end of the call to avoid duplicate invocations of the selector.

This is a stateful LINQ query with side-effects, which is by nature unpredictable. Somewhere else in the code, you called something that caused the first element to be evaluated, like First() or Any(). In general, it is dangerous to have side-effects in LINQ queries, and when you find yourself needing them, it's time to think about whether or not it should just be a foreach instead.

Edit, or Why is this happening?

This is a result of how LINQ queries are evaluated: until you actually use the results of a query, nothing really happens to the collection. It doesn't evaluate any of the elements. Instead, it stores Abstract Expression Trees or just the delegates it needs to evaluate the query. Then, it evaluates those only when the results are needed, and unless you explicitly store the results, they're thrown away afterwards, and re-evaluated the next time.

So this makes the question why does it have different results each time? The answer is that runningTotal is only initialized the first time around. After that, its value is whatever it was after the last execution of the query, which can lead to strange results.

This means the question could just have easily have been "Why is the total always twice what it should be?" if the asker were doing something like this:

Console.WriteLine(statement.Count()); // this enumerates all the elements!
foreach (var item in statement) { Console.WriteLine(item.Total); }

Because the only way to get the number of elements in the sequence is to actually evaluate all of them.

Similarly, what actually happened in this question was that somewhere there was code like this:

if (statement.Any()) // this actually involves getting the first result
{ 
    // do something with the statement
}
// ...
foreach (var item in statement) { Console.WriteLine(item.Total); }

It seems innocuous, but if you know how LINQ and IEnumerable work, you know that .Any() is basically the same as .GetEnumerator().MoveNext(), which makes it more obvious that it requires getting the first element.

It all boils down to the fact that LINQ is based on deferred execution, which is why the solution is to use ToList, which circumvents that and forces immediate execution.

like image 110
ahruss Avatar answered Oct 15 '22 22:10

ahruss


If you don't want to freeze the results with ToList, a solution to the outer scope variable problem is using an iterator function, like this:

IEnumerable<StatementModel> GetStatement(IEnumerable<DataObject> source) {
    decimal runningTotal = 0;
    foreach (var x in source) {
        yield return new StatementModel() {
            ...
            RunningTotal = (runningTotal += x.GoodsValueInAccountCurrency)
        };
    }
}

Then pass to this function the source query (not including the Select):

var statement = GetStatement(sage.Repository...AsEnumerable());

Now it is safe to enumerate statement multiple times. Basically, this creates an enumerable that re-executes this entire block on each enumeration, as opposed to executing a selector (which equates to only the foreach part) -- so runningTotal will be reset.

like image 1
nmclean Avatar answered Oct 15 '22 22:10

nmclean