ActionView::Template::Error (PG::Error: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
I'm creating an events website and I'm trying to sort the rendered rsvps by the start time of the event. There are a lot of RSVPS so I'm grouping them with distinct, but I've been having a lot of difficulty over the last few days on sorting the results without this error popping up on PG. I've looked at some of the previous questions on the topic and am still pretty lost. How can I get this to work? Thank you so much!
@rsvps = Rsvp.where(:voter_id => current_user.following.collect {|f| f["id"]}, :status => 'going').where("start_time > ? AND start_time < ?", Time.now, Time.now + 1.month).order("count_all desc").count(:group => :event_id).collect { |f| f[0] } <%= render :partial => 'rsvps/rsvp', :collection => Rsvp.where(:event_id => @rsvps).select("DISTINCT(event_id)").order('start_time asc') %>
I know this is a rather old question, but I just went through a small example in my head which helped me understand why Postgres has this seemingly odd restriction on SELECT DISTINCT / ORDER BY columns.
Imagine you have the following data in your Rsvp table:
event_id | start_time ----------+------------------------ 0 | Mar 17, 2013 12:00:00 1 | Jan 1, 1970 00:00:00 1 | Aug 21, 2013 16:30:00 2 | Jun 9, 2012 08:45:00
Now you want to grab a list of distinct event_ids, ordered by their respective start_times. But where should 1
go? Should it come first, because the one tuple starts on Jan 1, 1970, or should it go last because of the Aug 21, 2013?
As the database system can't make that decision for you and the syntax of the query can't depend on the actual data it might be operating on (assuming event_id
is unique), we are restricted to ordering only by columns from the SELECT
clause.
As for the actual question - an alternative to Matthew's answer is using an aggregate function like MIN
or MAX
for the sorting:
SELECT event_id FROM Rsvp GROUP BY event_id ORDER BY MIN(start_time)
The explicit grouping and aggregation on start_time
permit the database to come up with a unambiguous ordering of the result tuples. Note however, that readability is definitely an issue in this case ;)
The ORDER BY clause can only be applied after the DISTINCT has been applied. Since only the fields in the SELECT statement are taken into consideration for the DISTINCT operations, those are the only fields may be used in the ORDER BY.
Logically, if you just want a distinct list of event_id values, what order they occur in should be irrelevant. If order does matter, then you should add the start_time to the SELECT list so that there is context for the order.
Also, these two SELECT clauses are NOT equivalent, so be careful:
SELECT DISTINCT(event_id, start_time) FROM ... SELECT DISTINCT event_id, start_time FROM ...
The second is the form you want. The first will return a series of records with the data represented as a ROW construct (a single column with a tuple inside). The second will return normal columns of data output. It only works as expected in the single-column case where the ROW construct is reduced down since it is only a single column.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With