Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Linq query giving inappropriate output

I have two transaction tables named as ParentTransaction and ChildTransaction in which TransactionId of ParentTransaction will act as foreign to ChildTransaction of TransactionId.

Now I want to get all those TransactionId of ParentTransaction whose payamount is not completed.

From below output I want record of transaction Id 3 because only 1000 has been paid for transactionid 3 instead of 5000.

I have one table like this:

Transactionid(p.k)    PayAmount
  1                   1000
  2                   3000
  3                   5000
  4                   6000

ChildTransaction

Id        TransactionId(F.k)   DepositAmount
1           1                  600
2           1                  400
3           2                  1000
4           2                  1000
5           2                  1000
6           3                  2000

This is my query:

var data = (from tmp in context.ParentTransaction
            join tmp1 in context.ChildTransaction on tmp.Transactionid equals
            tmp1.Transactionid where tmp.PayAmount !=tmp1.DepositAmount
                    select tmp);

But here I am getting Transaction Id 1 and 2 although their transaction has been completed in two parts that is 600 and 400 for transaction id 1.

like image 793
Learning-Overthinker-Confused Avatar asked Jan 12 '16 06:01

Learning-Overthinker-Confused


2 Answers

The general idea of query languages is to express the desired result, not how to get it.

Applying it to your scenario leads to a simple query like this

var query = context.ParentTransaction
   .Where(t => t.PayAmount != context.ChildTransaction
      .Where(ct => ct.TransactionId == t.TransactionId)
      .Sum(ct => ct.DepositAmount));

If you are using EF and a proper model navigation properties, it would be even simple

var query = context.ParentTransaction
    .Where(t => t.PayAmount != t.ChildTransactions.Sum(ct => ct.DepositAmount));

One may say the above would be inefficient compared to let say the one from @Vadim Martynov answer. Well, may be yes, may be not. Vadim is trying to force a specific execution plan and I can understand that - we have to do such things when in reality encounter a query performance issues. But it's not natural and should be a last resort only if we have a performance problems. Query providers and SQL query optimizers will do (and are doing) that job for us in most of the cases, so we don't need to think of whether we need to use a join vs subquery etc.

like image 125
Ivan Stoev Avatar answered Oct 19 '22 17:10

Ivan Stoev


I'm not sure that != is a best value. Here is a solution with > check and grouping:

var expectedValue =
            context.ChildTransaction
                .GroupBy(t => t.TransactionId, (key, group) => new { TransactionId = key, Deposit = group.Sum(e => e.Deposit) })
                .Join(context.ParentTransaction, grouped => grouped.TransactionId, transaction => transaction.TransactionId, (group, transaction) => new { Transaction = transaction, group.Deposit })
                .Where(result => result.Transaction.PayAmount > result.Deposit)
                .Select(result => result.Transaction);

This query can be read in a declare manner like next requirement:

  1. Group collection of child transactions by TransactionId and for each group retrieve an anonymous type object with fields TransactionId = grouping key (== TransactionId) and Deposit which is sum of Deposits for rows with same TransactionId.
  2. Join set from part 1 to the the table PaerntTransaction by TransactionId field. For each joined pair retrieve an anonymous type object with fields Transaction == transaction from ParentTransactions table and Deposit which is deposit from part 1 set which is sum of Deposits with the same TransactionId from the ChildTransactions table.
  3. Filter from result set only objects where PayAmount greather than sum of deposits.
  4. Return only ParentTransaction object for each filtered row.

This is SQL-optimized scenario because join, filter and grouping prevents nested queries which can be added to the actual execution plan in other cases and make worse performance.

UPDATE

To solve the problem with transaction that have no deposits you can use LEFT JOIN:

var expectedValue = from parent in context.ParentTransaction
            join child in context.ChildTransaction on parent.TransactionId equals child.TransactionId into gj
            from subset in gj.DefaultIfEmpty()
            let joined = new { Transaction = parent, Deposit = subset != null ? subset.Deposit : 0 }
            group joined by joined.Transaction
            into grouped
            let g = new { Transaction = grouped.Key, Deposit = grouped.Sum(e => e.Deposit) }
            where g.Transaction.PayAmount > g.Deposit
            select g.Transaction;

The same query with LINQ method chain:

var expectedValue =
            context.ParentTransaction
                .GroupJoin(context.ChildTransaction, parent => parent.TransactionId, child => child.TransactionId, (parent, gj) => new { parent, gj })
                .SelectMany(@t => @t.gj.DefaultIfEmpty(), (@t, subset) => new { @t, subset })
                .Select(@t => new { @t, joined = new { Transaction = @[email protected], Deposit = @t.subset != null ? @t.subset.Deposit : 0 } })
                .GroupBy(@t => @t.joined.Transaction, @t => @t.joined)
                .Select(grouped => new { grouped, g = new { Transaction = grouped.Key, Deposit = grouped.Sum(e => e.Deposit) } })
                .Where(@t => @t.g.Transaction.PayAmount > @t.g.Deposit)
                .Select(@t => @t.g.Transaction);

Now you retrieve all parent transaction and join it with child transaction but if there is no children then use Deposit == 0 and group joined entities in a similar manner by ParentTransaction.

like image 33
Vadim Martynov Avatar answered Oct 19 '22 17:10

Vadim Martynov