Weighted Average in LINQ to Entities

Please consider these tables:

Weight:

City          Weight
--------------------
Tokyo          100
München        150
Köln           200

and Data:

ID       Country       City           Value
--------------------------------------------
1        Germany       München        10
2        Germany       München        20
3        Japan         Tokyo          12
4        Japan         Tokyo          20
5        Japan         Tokyo          8
6        Germany       Köln           5 
7        Germany       Köln           7
8        Germany       Köln           9

I want to calculate the weighted average for each country.

I wrote this query:

var MyResult = (from d in MyContext.Data
                join w in MyContext.Weight
                on d.City equals w.City
                select new {
                   d.Country,
                   d.City,
                   d.Value,
                   w.Weight
                }).GroupBy(p=>new {p.Country})
                  .Select(o=>new
                             {
                                 o.Key.Country,
                                 WeightedAverage = o.Sum(k=>k.Value * k.Weight) / o.Sum(k=>k.Weight)
                             })

But it returns the wrong weighted average. I want to calculate this formula:

For Germany:

(10 * 150 + 20 * 150 + 5 * 200 + 7 * 200 + 9 * 200) / (150 + 150 + 200 + 200 + 200)

How can I achieve the desired result?

Thanks

Arian asked Oct 16 '25 20:10


2 Answers

Could you please write the SQL version of the query?

Because your data is denormalized in a rather peculiar way, one way to handle it is a subquery that builds the Country -> SUM(Weight) "mapping". For example (SQL Server does not seem to have a FIRST aggregate function, so I substitute MAX here, which will not work correctly when some data is missing from the Weight table):

WITH WeightDict(Country, Weight) as (
  select Country, sum(Weight) as Weight
  from (
   select Country, w.City, max(Weight) as Weight
   from Weight w
   join Data d on w.City = d.City
   group by Country, w.City) sub
  group by Country
)

select d.Country, 1.0 * sum(d.Value * w.Weight)/max(wd.Weight) as WeightedAverage 
from Data d
join Weight w on w.City = d.City
join WeightDict wd on d.Country = wd.Country
group by d.Country

Sample db fiddle - https://www.db-fiddle.com/f/fg1rXi5gTPjcGv7MS35TGB/1 . Output:

country weightedaverage
Germany 24.8571428571428571
Japan 40.0000000000000000
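
As a sanity check, the two figures above can be reproduced by hand from the sample tables. In this reading each city's weight enters the denominator once per country, which is what WeightDict computes:

```latex
\begin{align*}
\text{Germany} &= \frac{10 \cdot 150 + 20 \cdot 150 + 5 \cdot 200 + 7 \cdot 200 + 9 \cdot 200}{150 + 200}
                = \frac{8700}{350} \approx 24.857 \\
\text{Japan}   &= \frac{12 \cdot 100 + 20 \cdot 100 + 8 \cdot 100}{100}
                = \frac{4000}{100} = 40
\end{align*}
```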

I am not sure whether this can be rewritten as an EF Core LINQ query, but you can try doing it in parts (i.e. one query to fetch the dictionary and another to fetch the weighted sum, sum(d.Value * w.Weight)) and perform the final join in memory.
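
A minimal sketch of that "in parts" idea, using plain LINQ to Objects. The record types DataRow and WeightRow and the class WeightedAverageInParts are hypothetical names invented for illustration; in a real application the two lists would come from MyContext, and the two intermediate dictionaries could be fetched as two separate database queries instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shapes mirroring the question's tables (not the real entities).
public record DataRow(int Id, string Country, string City, int Value);
public record WeightRow(string City, int Weight);

public static class WeightedAverageInParts
{
    public static Dictionary<string, double> Compute(
        IReadOnlyList<DataRow> data, IReadOnlyList<WeightRow> weights)
    {
        // Part 1: weighted sum per country, i.e. sum(d.Value * w.Weight).
        var weightedSums = data
            .Join(weights, d => d.City, w => w.City,
                  (d, w) => new { d.Country, Product = (double)d.Value * w.Weight })
            .GroupBy(x => x.Country)
            .ToDictionary(g => g.Key, g => g.Sum(x => x.Product));

        // Part 2: the Country -> SUM(Weight) "dictionary", counting each city once.
        var weightTotals = data
            .Select(d => new { d.Country, d.City })
            .Distinct()
            .Join(weights, c => c.City, w => w.City,
                  (c, w) => new { c.Country, w.Weight })
            .GroupBy(x => x.Country)
            .ToDictionary(g => g.Key, g => (double)g.Sum(x => x.Weight));

        // Final join in memory: divide each weighted sum by its country's total weight.
        return weightedSums.ToDictionary(
            kv => kv.Key, kv => kv.Value / weightTotals[kv.Key]);
    }

    public static void Main()
    {
        var weights = new List<WeightRow>
        {
            new("Tokyo", 100), new("München", 150), new("Köln", 200)
        };
        var data = new List<DataRow>
        {
            new(1, "Germany", "München", 10), new(2, "Germany", "München", 20),
            new(3, "Japan", "Tokyo", 12),     new(4, "Japan", "Tokyo", 20),
            new(5, "Japan", "Tokyo", 8),      new(6, "Germany", "Köln", 5),
            new(7, "Germany", "Köln", 7),     new(8, "Germany", "Köln", 9)
        };

        foreach (var (country, average) in Compute(data, weights))
            Console.WriteLine($"{country}: {average}");
    }
}
```

With the sample data this should give roughly 24.86 for Germany and 40 for Japan, matching the SQL output above. Casting to double before dividing avoids the integer division that an int / int expression would perform.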

Guru Stron answered Oct 19 '25 09:10


I am not sure EF can translate this either, but it is too long for a comment, so I am posting it as an answer. Try this:

var MyResult = (from d in MyContext.Data
                join w in MyContext.Weight
                on d.City equals w.City
                select new {
                   d.Country,
                   d.City,
                   d.Value,
                   w.Weight
                }).GroupBy(p=>new {p.Country})
                  .Select(o=>new
                             {
                                 o.Key.Country,
                                 WeightedAverage = o.Sum(k=>k.Value * k.Weight) / 
       MyContext.Weight.Where(w=> o.Select(q=> q.City).Contains(w.City))
                       .Select(w=> w.Weight).Sum()
                             })

 // or, alternatively, take each city's weight once per country:
 WeightedAverage = o.Sum(k=>k.Value * k.Weight) /
 o.GroupBy(r=> r.City).Select(r=> r.First()).Sum(k=>k.Weight)

Edit:

If the above query doesn't work, only two options are left: enumerate the query so that you can use in-memory LINQ, where DistinctBy is available, or write raw SQL using a library such as Dapper.
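
A rough sketch of the first option, assuming .NET 6 or later (where Enumerable.DistinctBy exists). JoinedRow and InMemoryWeightedAverage are hypothetical names; the rows stand in for the result of the Data/Weight join after it has been enumerated, for example with ToList() or AsEnumerable().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one joined row (Data joined to Weight on City).
public record JoinedRow(string Country, string City, int Value, int Weight);

public static class InMemoryWeightedAverage
{
    public static List<(string Country, double WeightedAverage)> Compute(
        IEnumerable<JoinedRow> rows) =>
        rows.GroupBy(r => r.Country)
            .Select(g => (
                Country: g.Key,
                // Numerator: sum of Value * Weight over every row of the country.
                // Denominator: each city's weight counted once, via DistinctBy.
                WeightedAverage: g.Sum(r => (double)r.Value * r.Weight)
                                 / g.DistinctBy(r => r.City).Sum(r => r.Weight)))
            .ToList();

    public static void Main()
    {
        // Sample rows from the question, already joined in memory.
        var rows = new List<JoinedRow>
        {
            new("Germany", "München", 10, 150), new("Germany", "München", 20, 150),
            new("Japan", "Tokyo", 12, 100),     new("Japan", "Tokyo", 20, 100),
            new("Japan", "Tokyo", 8, 100),      new("Germany", "Köln", 5, 200),
            new("Germany", "Köln", 7, 200),     new("Germany", "Köln", 9, 200)
        };

        foreach (var (country, average) in Compute(rows))
            Console.WriteLine($"{country}: {average}");
    }
}
```

Everything after the enumeration runs on the client, so this only suits result sets small enough to load into memory.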

Here is the equivalent SQL for SQL Server:

SELECT d.country,
       1.0 * Sum(w.weight * d.value) / q.total_weight AS WeightedAverage
FROM   weights w
       INNER JOIN data d
               ON w.city = d.city
       -- q returns a single row: the country's city weights, each city counted once
       CROSS apply (SELECT Sum(z.weight) AS total_weight
                    FROM   weights z
                    WHERE  z.city IN (SELECT x.city
                                      FROM   data x
                                      WHERE  x.country = d.country)) q
GROUP  BY d.country, q.total_weight

Fiddle

And BTW having a Country column in the Weight table would make writing queries easier.

Eldar answered Oct 19 '25 09:10


