Aggregate columns with additional (distinct) filters

This code works as expected, but it's long and creepy.

select p.name, p.played, w.won, l.lost from

(select users.name, count(games.name) as played
from users
inner join games on games.player_1_id = users.id
where games.winner_id > 0
group by users.name
union
select users.name, count(games.name) as played
from users
inner join games on games.player_2_id = users.id
where games.winner_id > 0
group by users.name) as p

inner join

(select users.name, count(games.name) as won
from users
inner join games on games.player_1_id = users.id
where games.winner_id = users.id
group by users.name
union
select users.name, count(games.name) as won
from users
inner join games on games.player_2_id = users.id
where games.winner_id = users.id
group by users.name) as w on p.name = w.name

inner join

(select users.name, count(games.name) as lost
from users
inner join games on games.player_1_id = users.id
where games.winner_id != users.id
group by users.name
union
select users.name, count(games.name) as lost
from users
inner join games on games.player_2_id = users.id
where games.winner_id != users.id
group by users.name) as l on l.name = p.name

As you can see, it consists of 3 repetitive parts that retrieve:

  • player name and the number of games they played
  • player name and the number of games they won
  • player name and the number of games they lost

And each of those in turn consists of 2 parts:

  • player name and the number of games in which they participated as player_1
  • player name and the number of games in which they participated as player_2

How could this be simplified?

The result looks like so:

           name            | played | won | lost 
---------------------------+--------+-----+------
 player_a                  |      5 |   2 |    3
 player_b                  |      3 |   2 |    1
 player_c                  |      2 |   1 |    1
asked Jan 09 '23 09:01 by ave


2 Answers

The standard-SQL aggregate FILTER clause in Postgres 9.4 or newer is shorter and faster:

SELECT u.name
     , count(*) FILTER (WHERE g.winner_id  > 0)    AS played
     , count(*) FILTER (WHERE g.winner_id  = u.id) AS won
     , count(*) FILTER (WHERE g.winner_id <> u.id) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name;

See:

  • The manual
  • Postgres Wiki
  • Depesz blog post
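
To try the FILTER query without a Postgres server, here is a minimal sketch that runs it through Python's built-in sqlite3 module. It assumes SQLite 3.30 or newer (the first release with aggregate FILTER), a simplified hypothetical schema, and made-up sample rows chosen so the output matches the result table in the question. The ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# Hypothetical sample data, chosen to reproduce the question's result table.
USERS = [(1, "player_a"), (2, "player_b"), (3, "player_c")]
# (player_1_id, player_2_id, winner_id)
GAMES = [
    (1, 2, 2),
    (1, 2, 2),
    (2, 1, 1),
    (1, 3, 1),
    (3, 1, 3),
]

# The query from the answer, plus ORDER BY for a stable row order.
QUERY = """
SELECT u.name
     , count(*) FILTER (WHERE g.winner_id  > 0)    AS played
     , count(*) FILTER (WHERE g.winner_id  = u.id) AS won
     , count(*) FILTER (WHERE g.winner_id <> u.id) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name
ORDER  BY u.name
"""


def player_stats():
    """Load the sample rows into an in-memory database and run QUERY."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    con.execute(
        "CREATE TABLE games ("
        "player_1_id INTEGER, player_2_id INTEGER, winner_id INTEGER)"
    )
    con.executemany("INSERT INTO users VALUES (?, ?)", USERS)
    con.executemany("INSERT INTO games VALUES (?, ?, ?)", GAMES)
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows


for row in player_stats():
    print(row)
# ('player_a', 5, 2, 3)
# ('player_b', 3, 2, 1)
# ('player_c', 2, 1, 1)
```

Because the join matches a user on either player column, each game row is seen once per participant, which is what removes the need for the UNION branches in the question.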

In Postgres 9.3 (or any version) this is still shorter and faster than nested sub-selects or CASE expressions:

SELECT u.name
     , count(g.winner_id  > 0 OR NULL)    AS played
     , count(g.winner_id  = u.id OR NULL) AS won
     , count(g.winner_id <> u.id OR NULL) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name;

See:

  • For absolute performance, is SUM faster or COUNT?
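
The `OR NULL` form relies on two standard SQL rules: `TRUE OR NULL` is TRUE while `FALSE OR NULL` is NULL, and `count(expression)` skips NULLs. A small sketch of both rules, using Python's sqlite3 module as a stand-in for Postgres (the table `t` and its rows are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# TRUE OR NULL is TRUE; FALSE OR NULL is NULL.
# SQLite reports TRUE as 1, and Python shows NULL as None.
truth = con.execute("SELECT (1 = 1) OR NULL, (1 = 2) OR NULL").fetchone()
print(truth)  # (1, None)

# Hypothetical one-column table to show what count() actually counts.
con.execute("CREATE TABLE t (winner_id INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(0,), (1,), (2,), (None,)])

# count(*) counts every row; count(expr) counts only rows where expr is
# not NULL, i.e. only rows where the condition is TRUE.
counted = con.execute(
    "SELECT count(*), count(winner_id > 0 OR NULL) FROM t"
).fetchone()
print(counted)  # (4, 2)
con.close()
```

Rows where the condition is FALSE or where `winner_id` itself is NULL both collapse to NULL and drop out of the count.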
answered Jan 19 '23 06:01 by Erwin Brandstetter


This is a case where correlated subqueries may simplify the logic:

select u.*, (played - won) as lost
from (select u.*,
             (select count(*)
              from games g
              where g.player_1_id = u.id or g.player_2_id = u.id
             ) as played,
             (select count(*)
              from games g
              where g.winner_id = u.id
             ) as won
      from users u
     ) u;

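This assumes that there are no ties: lost is derived as played - won, so every game counted as played must have a winner. Note that played here counts all of a user's games, without the winner_id > 0 filter used in the question.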
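
To illustrate why that assumption matters, here is a small sketch using Python's sqlite3 module. The rows are hypothetical, and it assumes a tie or unfinished game is stored with winner_id = 0, which the question's `winner_id > 0` filter suggests but does not state. The optional extra condition is an illustration, not part of the answer above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
con.execute(
    "CREATE TABLE games ("
    "player_1_id INTEGER, player_2_id INTEGER, winner_id INTEGER)"
)
con.executemany(
    "INSERT INTO users VALUES (?, ?)", [(1, "player_a"), (2, "player_b")]
)
con.executemany(
    "INSERT INTO games VALUES (?, ?, ?)",
    [
        (1, 2, 1),  # player_a wins
        (1, 2, 2),  # player_b wins
        (1, 2, 0),  # assumption: winner_id = 0 marks a tie or unfinished game
    ],
)


def stats(extra_played_condition=""):
    """Run the correlated-subquery approach; optionally narrow 'played'."""
    sql = f"""
        select s.name, s.played, s.won, (s.played - s.won) as lost
        from (select u.name,
                     (select count(*)
                      from games g
                      where (g.player_1_id = u.id or g.player_2_id = u.id)
                            {extra_played_condition}
                     ) as played,
                     (select count(*)
                      from games g
                      where g.winner_id = u.id
                     ) as won
              from users u
             ) s
        order by s.name
    """
    return con.execute(sql).fetchall()


# Without a filter, the undecided game is counted as a loss for both players.
print(stats())
# [('player_a', 3, 1, 2), ('player_b', 3, 1, 2)]

# Restricting 'played' to decided games gives one loss each.
print(stats("and g.winner_id > 0"))
# [('player_a', 2, 1, 1), ('player_b', 2, 1, 1)]
```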

answered Jan 19 '23 05:01 by Gordon Linoff