I have a Rails app, which uses postgresql for a database, that sorts different types of users by location, and then by reputation points they receive for various activities on the site. This is an example query
@lawyersbylocation = User.lawyers_by_province(province).sort_by{ |u| -u.total_votes }
The query calls the scope lawyers_by_province on the User.rb model:
scope :lawyers_by_province, lambda {|province|
joins(:contact).
where( contacts: {province_id: province},
users: {lawyer: true})
}
And then, still on the User.rb model, it calculates reputation points they have.
def total_votes
answerkarma = AnswerVote.joins(:answer).where(answers: {user_id: self.id}).sum('value')
contributionkarma = Contribution.where(user_id: self.id).sum('value')
bestanswer = BestAnswer.joins(:answer).where(answers: {user_id: self.id}).sum('value')
answerkarma + contributionkarma + bestanswer
end
I've been told that if the site reaches a certain number of users, then it will become incredibly slow because it's sorting in Ruby rather than at the database level. I know that comment refers to the total_votes method, but I'm not sure if the lawyers_by_province is happening at the database level or in ruby, in that it's using Rails helpers to query the db. Seems like a mix of both to me, but I'm not sure about the effect of that on efficiency.
Can you show me how to write this so that the query is happening at the db level and therefore in a more efficient way that won't break my site?
Update Here are the three schemes for models in total_votes method.
create_table "answer_votes", force: true do |t|
t.integer "answer_id"
t.integer "user_id"
t.integer "value"
t.boolean "lawyervote"
t.boolean "studentvote"
t.datetime "created_at"
t.datetime "updated_at"
end
add_index "answer_votes", ["answer_id"], name: "index_answer_votes_on_answer_id", using: :btree
add_index "answer_votes", ["lawyervote"], name: "index_answer_votes_on_lawyervote", using: :btree
add_index "answer_votes", ["studentvote"], name: "index_answer_votes_on_studentvote", using: :btree
add_index "answer_votes", ["user_id"], name: "index_answer_votes_on_user_id", using: :btree
create_table "best_answers", force: true do |t|
t.integer "answer_id"
t.integer "user_id"
t.integer "value"
t.datetime "created_at"
t.datetime "updated_at"
t.integer "question_id"
end
add_index "best_answers", ["answer_id"], name: "index_best_answers_on_answer_id", using: :btree
add_index "best_answers", ["user_id"], name: "index_best_answers_on_user_id", using: :btree
create_table "contributions", force: true do |t|
t.integer "user_id"
t.integer "answer_id"
t.integer "value"
t.datetime "created_at"
t.datetime "updated_at"
end
add_index "contributions", ["answer_id"], name: "index_contributions_on_answer_id", using: :btree
add_index "contributions", ["user_id"], name: "index_contributions_on_user_id", using: :btree
Also, here is the contact scheme which contains the province_id used in the lawyers_by_province scope on user.rb model
create_table "contacts", force: true do |t|
t.string "firm"
t.string "address"
t.integer "province_id"
t.string "city"
t.string "postalcode"
t.string "mobile"
t.string "office"
t.integer "user_id"
t.string "website"
t.datetime "created_at"
t.datetime "updated_at"
end
Update Trying to apply the answer by @Shawn, I put this method in the user.rb model
def self.total_vote_sql
"( " +
[
AnswerVote.joins(:answer).select("user_id, value"),
Contribution.select("user_id, value"),
BestAnswer.joins(:answer).select("user_id, value")
].map(&:to_sql) * " UNION ALL " +
") as total_votes "
end
and then in the controller, I did this (putting User
in front of total_vote_sql
)
@lawyersbyprovince = User.select("users.*, sum(total_votes.value) as total_votes").joins("left outer join #{User.total_vote_sql} on users.id = total_votes.user_id").
order("total_votes desc").lawyers_by_province(province)
It's giving me this error
ActiveRecord::StatementInvalid in LawyerProfilesController#index
PG::Error: ERROR: column reference "user_id" is ambiguous LINE 1: ..."user_id" = "users"."id" left outer join ( SELECT user_id, v... ^ : SELECT users.*, sum(total_votes.value) as total_votes FROM "users" INNER JOIN "contacts" ON "contacts"."user_id" = "users"."id" left outer join ( SELECT user_id, value FROM "answer_votes" INNER JOIN "answers" ON "answers"."id" = "answer_votes"."answer_id" UNION ALL SELECT user_id, value FROM "contributions" UNION ALL SELECT user_id, value FROM "best_answers" INNER JOIN "answers" ON "answers"."id" = "best_answers"."answer_id") as total_votes on users.id = total_votes.user_id WHERE "contacts"."province_id" = 6 AND "users"."lawyer" = 't' ORDER BY total_votes desc
Update After applying edits to Shawn's post, the error message is now this:
PG::Error: ERROR: column reference "user_id" is ambiguous LINE 1: ..."user_id" = "users"."id" left outer join ( SELECT user_id as... ^ : SELECT users.*, sum(total_votes.value) as total_votes FROM "users" INNER JOIN "contacts" ON "contacts"."user_id" = "users"."id" left outer join ( SELECT user_id as tv_user_id, value FROM "answer_votes" INNER JOIN "answers" ON "answers"."id" = "answer_votes"."answer_id" UNION ALL SELECT user_id as tv_user_id, value FROM "contributions" UNION ALL SELECT user_id as tv_user_id, value FROM "best_answers" INNER JOIN "answers" ON "answers"."id" = "best_answers"."answer_id") as total_votes on users.id = total_votes.tv_user_id WHERE "contacts"."province_id" = 6 AND "users"."lawyer" = 't' ORDER BY total_votes desc
Caveat: I'm quite new to Rails, but this is my technique for keeping sane while needing to continually go straight to the database for performance reasons, which I need to do all the time because you can only have two of the following
Anyway, once you need to get into these hybrid methodologies, which are part-ruby part-SQL I feel like you might as well go the whole hog and opt for a pure SQL solution.
I think the SQL that you're looking for here is along the lines of:
with cte_scoring as (
select
users.id,
(select Coalesce(sum(value),0) from answer_votes where answer_votes.user_id = users.id) +
(select Coalesce(sum(value),0) from best_answers where best_answers.user_id = users.id) +
(select Coalesce(sum(value),0) from contributions where contributions.user_id = users.id) total_score
from
users join
contacts on (contacts.user_id = users.id)
where
users.lawyer = 'true' and
contacts.province_id = #{province.id})
select id,
total_score
from cte_scoring
order by total_score desc
limit #{limit_number}
This ought to give you the best possible performance -- the subqueries in the SELECT are not ideal but the technique does apply filtering on which user_id you're checking the score for.
Integrating into Rails: If you define sql_string as the SQL code:
scoring = ActiveRecord::Base.connection.execute sql_string
... then you get an array of hashes back that you work with like this:
scoring.each do |lawyer_score|
lawyer = User.find(lawyer_score["id"])
score = lawyer_score["total_score"]
...
end
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With