Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgresql Query - Ordering by result of subquery

I have a query where I want to select all users who like a given set of artists. There also are some other WHERE criteria on country etc. Here's what the schema looks like.

          users                     favourite_artists             artists

+----------+------------+    +-----------+------------+    +--------+--------+
|    id    |  country   |    |  user_id  |  artist_id |    |  id    |  name  |
+----------+------------+    +-----------+------------+    +--------+--------+
|     1    |     gb     |    |     1     |      6     |    |   1    |  Muse  |
|     2    |     gb     |    |     1     |      5     |    |   2    |  RATM  |
|     3    |     us     |    |     1     |      3     |    |   3    |  ABBA  | 
|     4    |     us     |    |     2     |      3     |    |   4    |   U2   |
+----------+------------+    +-----------+------------+    +--------+--------+

I want to order them by the number of those artists they like. I also want to include users who don't like any of the artists but who match the WHERE criteria. The expected result set would look like.

+--------+---------------+----------------+
|   id   |    country    |   match_count  |
+--------+---------------+----------------+
|    6   |      gb       |       4        |
|    9   |      gb       |       4        |
|    2   |      gb       |       3        |
|    1   |      gb       |       2        |
|    5   |      gb       |       0        |
|    4   |      gb       |       0        |
+--------+---------------+----------------+

I've been trying to do it using a subquery to get the match_count and ordering by that but it's performing pretty slowly so I thought there'd have to be a better way.

   SELECT users.id, users.country
   (SELECT COUNT(*) FROM favourite_artists 
    WHERE user_id = users.id AND artist_id IN (1,3,4,9)) AS match_count        
   FROM "users" 
   WHERE users.country = 'gb'
   ORDER BY match_count DESC;

I'm using Postgresql 9.0.7. Any thoughts?

like image 599
johnnymire Avatar asked May 04 '12 10:05

johnnymire


1 Answers

Your query is executing one subquery for every row in users. Such queries are called "correlated subqueries" and their performance, quite understandably, sucks.

Instead you want a join:

SELECT users.id, users.country, count(artist_id) as match_count
FROM users
LEFT JOIN favourite_artists ON user_id = users.id AND artist_id IN (1,3,4,9)
WHERE users.country = 'gb'
GROUP BY 1, 2
ORDER BY 3 DESC;

This query will get the joining rows far more efficiently, assuming you have an index on favourite_artists(user_id) - or better yet a multi-column index favourite_artists(user_id, artist_id).

like image 183
Bohemian Avatar answered Oct 12 '22 23:10

Bohemian