Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Rails Arel selecting distinct columns

I've hit a slight block with the new scope methods (Arel 0.4.0, Rails 3.0.0.rc)

Basically I have:

A topics model, which has_many :comments, and a comments model (with a topic_id column) which belongs_to :topics.

I'm trying to fetch a collection of "Hot Topics", i.e. the topics that were most recently commented on. Current code is as follows:

# models/comment.rb
scope :recent, order("comments.created_at DESC")

# models/topic.rb
scope :hot, joins(:comments) & Comment.recent & limit(5)

If I execute Topic.hot.to_sql, the following query is fired:

SELECT "topics".* FROM "topics" INNER JOIN "comments"
ON "comments"."topic_id" = "topics"."id"
ORDER BY comments.created_at DESC LIMIT 5

This works fine, but it potentially returns duplicate topics - If topic #3 was recently commented on several times, it would be returned several times.

My question

How would I go about returning a distinct set of topics, bearing in mind that I still need to access the comments.created_at field, to display how long ago the last post was? I would imagine something along the lines of distinct or group_by, but I'm not too sure how best to go about it.

Any advice / suggestions are much appreciated - I've added a 100 rep bounty in hopes of coming to an elegant solution soon.

like image 750
Jeriko Avatar asked Aug 24 '10 08:08

Jeriko


4 Answers

Solution 1

This doesn't use Arel, but Rails 2.x syntax:

Topic.all(:select => "topics.*, C.id AS last_comment_id, 
                       C.created_at AS last_comment_at",
          :joins => "JOINS (
             SELECT DISTINCT A.id, A.topic_id, B.created_at
             FROM   messages A,
             (
               SELECT   topic_id, max(created_at) AS created_at
               FROM     comments
               GROUP BY topic_id
               ORDER BY created_at
               LIMIT 5
             ) B
             WHERE  A.user_id    = B.user_id AND 
                    A.created_at = B.created_at
           ) AS C ON topics.id = C.topic_id
          "
).each do |topic|
  p "topic id: #{topic.id}"
  p "last comment id: #{topic.last_comment_id}"
  p "last comment at: #{topic.last_comment_at}"
end

Make sure you index the created_at and topic_id column in the comments table.

Solution 2

Add a last_comment_id column in your Topic model. Update the last_comment_id after creating a comment. This approach is much faster than using complex SQL to determine the last comment.

E.g:

class Topic < ActiveRecord::Base
  has_many :comments
  belongs_to :last_comment, :class_name => "Comment"
  scope :hot, joins(:last_comment).order("comments.created_at DESC").limit(5)
end

class  Comment
  belongs_to :topic

  after_create :update_topic

  def update_topic
    topic.last_comment = self
    topic.save
    # OR better still
    # topic.update_attribute(:last_comment_id, id)
  end
end

This is much efficient than running a complex SQL query to determine the hot topics.

like image 122
Harish Shetty Avatar answered Oct 17 '22 15:10

Harish Shetty


This is not that elegant in most SQL implementations. One way is to first get the list of the five most recent comments grouped by topic_id. Then get the comments.created_at by sub selecting with the IN clause.

I'm very new to Arel but something like this could work

recent_unique_comments = Comment.group(c[:topic_id]) \
                                .order('comments.created_at DESC') \
                                .limit(5) \
                                .project(comments[:topic_id]
recent_topics = Topic.where(t[:topic_id].in(recent_unique_comments))

# Another experiment (there has to be another way...)

recent_comments = Comment.join(Topic) \
                         .on(Comment[:topic_id].eq(Topic[:topic_id])) \ 
                         .where(t[:topic_id].in(recent_unique_comments)) \
                         .order('comments.topic_id, comments.created_at DESC') \
                         .group_by(&:topic_id).to_a.map{|hsh| hsh[1][0]}
like image 27
Jonas Elfström Avatar answered Oct 17 '22 15:10

Jonas Elfström


In order to accomplish this you need to have a scope with a GROUP BY to get the latest comment for each topic. You can then order this scope by created_at to get the most recent commented on topics.

The following works for me using sqlite

class Comment < ActiveRecord::Base

  belongs_to :topic

  scope :recent, order("comments.created_at DESC")
  scope :latest_by_topic, group("comments.topic_id").order("comments.created_at DESC")
end


class Topic < ActiveRecord::Base
  has_many :comments

  scope :hot, joins(:comments) & Comment.latest_by_topic & limit(5)
end

I used the following seeds.rb to generate the test data

(1..10).each do |t|
  topic = Topic.new
  (1..10).each do |c|
    topic.comments.build(:subject => "Comment #{c} for topic #{t}")
  end
  topic.save
end

And the following are the test results

ruby-1.9.2-p0 > Topic.hot.map(&:id)
 => [10, 9, 8, 7, 6] 
ruby-1.9.2-p0 > Topic.first.comments.create(:subject => 'Topic 1 - New comment')
 => #<Comment id: 101, subject: "Topic 1 - New comment", topic_id: 1, content: nil, created_at: "2010-08-26 10:53:34", updated_at: "2010-08-26 10:53:34"> 
ruby-1.9.2-p0 > Topic.hot.map(&:id)
 => [1, 10, 9, 8, 7] 
ruby-1.9.2-p0 > 

The SQL generated for sqlite(reformatted) is extremely simple and I hope Arel would render different SQL for other engines as this would certainly fail in many DB engines as the columns within Topic are not in the "Group by list". If this did present a problem then you could probably overcome it by limiting the selected columns to just comments.topic_id

puts Topic.hot.to_sql
SELECT     "topics".* 
FROM       "topics" 
INNER JOIN "comments" ON "comments"."topic_id" = "topics"."id" 
GROUP BY  comments.topic_id 
ORDER BY  comments.created_at DESC LIMIT 5
like image 3
Steve Weet Avatar answered Oct 17 '22 17:10

Steve Weet


Since the question was about Arel, I thought I'd add this in, since Rails 3.2.1 adds uniq to the QueryMethods:

If you add .uniq to the Arel it adds DISTINCT to the select statement.

e.g. Topic.hot.uniq

Also works in scope:

e.g. scope :hot, joins(:comments).order("comments.created_at DESC").limit(5).uniq

So I would assume that

scope :hot, joins(:comments) & Comment.recent & limit(5) & uniq

should also probably work.

See http://apidock.com/rails/ActiveRecord/QueryMethods/uniq

like image 2
ice cream Avatar answered Oct 17 '22 16:10

ice cream