Rails 3.1 with PostgreSQL: GROUP BY must be used in an aggregate function

I am trying to load the latest 10 Arts grouped by user_id and ordered by created_at. This works fine with SQLite and MySQL, but gives an error on my new PostgreSQL database.

Art.all(:order => "created_at desc", :limit => 10, :group => "user_id")

ActiveRecord error:

Art Load (18.4ms)  SELECT "arts".* FROM "arts" GROUP BY user_id ORDER BY created_at desc LIMIT 10
ActiveRecord::StatementInvalid: PGError: ERROR:  column "arts.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT  "arts".* FROM "arts"  GROUP BY user_id ORDER BY crea...

Any ideas?

atmorell asked Aug 05 '11 08:08


3 Answers

The SQL generated by that expression is not a valid query: you are grouping by user_id and selecting many other fields, but not telling the DB how it should aggregate those other fields. For example, if your data looks like this:

a  | b
---|---
1  | 1
1  | 2
2  | 3

Now, when you ask the DB to group by a and also return b, it doesn't know how to collapse the values 1 and 2, which share a = 1, into a single row. You need to tell it whether to select the min, max, average, sum or something else. Just as I was writing this, two other answers were posted which might explain all this better.
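
The same ambiguity can be reproduced in plain Ruby, outside the database. This is only an illustrative sketch: the `rows` array mirrors the table above, and the choice of `min`, `max` and `sum` is an assumption standing in for whichever aggregate the query should have named.

```ruby
# Rows mirroring the example table above (columns a and b).
rows = [
  { a: 1, b: 1 },
  { a: 1, b: 2 },
  { a: 2, b: 3 }
]

# Grouping by a leaves several b values inside one group ...
groups = rows.group_by { |row| row[:a] }
b_values = groups.transform_values { |group| group.map { |row| row[:b] } }
# b_values == { 1 => [1, 2], 2 => [3] }

# ... so each group must be reduced to one value by an explicit aggregate.
aggregates = b_values.map do |a, bs|
  { a: a, min_b: bs.min, max_b: bs.max, sum_b: bs.sum }
end

aggregates.each { |row| puts row.inspect }
```

For a = 1 the three aggregates give three different answers (1, 2 and 3), which is why the database refuses to guess.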

In your use case, though, I don't think you want a GROUP BY at the DB level. As there are only 10 arts, you can group them in your application. Don't use this method with thousands of arts, though:

 arts = Art.all(:order => "created_at desc", :limit => 10)
 grouped_arts = arts.group_by {|art| art.user_id}
 # now you have a hash with following structure in grouped_arts
 # { 
 #    user_id1 => [art1, art4],
 #    user_id2 => [art3],
 #    user_id3 => [art5],
 #    ....
 # }

EDIT: Select latest_arts, but only one art per user

Just to give you an idea of the SQL (I have not tested it, as I don't have an RDBMS installed on my system):

SELECT arts.* FROM arts
WHERE (arts.user_id, arts.created_at) IN 
  (SELECT user_id, MAX(created_at) FROM arts
     GROUP BY user_id
     ORDER BY MAX(created_at) DESC
     LIMIT 10)
ORDER BY created_at DESC
LIMIT 10

This solution rests on the practical assumption that no two arts for the same user share the same highest created_at. That may well be wrong if you are importing arts or creating them programmatically in bulk. If the assumption doesn't hold, the outer query returns every tied art for that user, and the SQL might get more contrived.
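
If ties are possible, one way to make the choice deterministic is to break them on a second column. Below is a minimal sketch in plain Ruby, assuming each art has a unique id. The `ArtRecord` struct, the sample rows and the `latest_per_user` helper are hypothetical and only stand in for the real model; this is not the query above translated, just the same idea done in the application.

```ruby
# Hypothetical stand-in for the Art model; only the fields used here.
ArtRecord = Struct.new(:id, :user_id, :created_at)

# Returns one art per user: the newest by created_at, with ties broken by
# the highest id, sorted newest first and capped at `limit` entries.
def latest_per_user(arts, limit = 10)
  arts.group_by(&:user_id)
      .map { |_user_id, user_arts| user_arts.max_by { |art| [art.created_at, art.id] } }
      .sort_by { |art| [art.created_at, art.id] }
      .reverse
      .first(limit)
end

noon = Time.utc(2011, 8, 5, 12, 0, 0)
arts = [
  ArtRecord.new(1, 10, noon - 60),
  ArtRecord.new(2, 10, noon),      # same timestamp as id 3
  ArtRecord.new(3, 10, noon),      # wins the tie for user 10 (higher id)
  ArtRecord.new(4, 20, noon - 30)
]

latest = latest_per_user(arts)
puts latest.map(&:id).inspect # prints [3, 4]
```

Like the in-application grouping earlier, this loads every candidate row into memory, so it only suits small result sets.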

EDIT: Attempt to change the query to Arel:

Art.where("(arts.user_id, arts.created_at) IN 
             (SELECT user_id, MAX(created_at) FROM arts
                GROUP BY user_id
                ORDER BY MAX(created_at) DESC
                LIMIT 10)").
    order("created_at DESC").
    page(params[:page]).
    per(params[:per])

rubish answered Nov 03 '22 02:11


You need to select only the specific columns you need:

Art.select(:user_id).group(:user_id).limit(10)

It will raise an error when you also try to select title in the query, for example:

Art.select(:user_id, :title).group(:user_id).limit(10)

column "arts.title" must appear in the GROUP BY clause or be used in an aggregate function

That is because when you group by user_id, the query has no idea how to handle title, since each group can contain several titles.

So, as the exception already says, title needs to either appear in the GROUP BY clause:

Art.select(:user_id, :title).group(:user_id, :title).limit(10)

or be used in an aggregate function:

Art.select("user_id, array_agg(title) as titles").group(:user_id).limit(10)

Ilake Chang answered Nov 03 '22 02:11


Take a look at this post: SQLite to Postgres (Heroku) GROUP BY

PostgreSQL is actually following the SQL standard here, whilst SQLite and MySQL break from the standard and silently pick a value from some row in each group.

John Beynon answered Nov 03 '22 01:11