What is the best way to find records with duplicate values in a column using ruby and the new Activerecord?
Once you have grouped data you can filter out duplicates by using having clause. Having clause is the counterpart of where clause for aggregation queries. Just remember to provide a temporary name to count() data in order to use them in having clause.
Translating @TuteC into ActiveRecord:
sql = 'SELECT id,
COUNT(id) as quantity
FROM types
GROUP BY name
HAVING quantity > 1'
#=>
Type.select("id, count(id) as quantity")
.group(:name)
.having("quantity > 1")
Here's how I solved it with the AREL helpers, and no custom SQL:
Person.select("COUNT(last_name) as total, last_name")
.group(:last_name)
.having("COUNT(last_name) > 1")
.order(:last_name)
.map{|p| {p.last_name => p.total} }
Really, it's just a nicer way to write the SQL. This finds all records that have duplicate last_name values, and tells you how many and what the last names are in a nice hash.
I was beating my head against this problem with a 2016 stack (Rails 4.2, Ruby 2.2), and got what I wanted with this:
> Model.select([:thing]).group(:thing).having("count(thing) > 1").all.size
=> {"name1"=>5, "name2"=>4, "name3"=>3, "name4"=>2, "name5"=>2}
With custom SQL, this finds types
with same values for name
:
sql = 'SELECT id, COUNT(id) as quantity FROM types
GROUP BY name HAVING quantity > 1'
repeated = ActiveRecord::Base.connection.execute(sql)
In Rails 2.x, select is a private method of AR class. Just use find():
klass.find(:all,
:select => "id, count(the_col) as num",
:conditions => ["extra conditions here"],
:group => 'the_col',
:having => "num > 1")
Here is a solution that extends the other answers to show how to find and iterate through the records grouped by the duplicate field:
duplicate_values = Model.group(:field).having(Model.arel_table[:field].count.gt(1)).count.keys
Model.where(field: duplicate_values).group_by(&:field).each do |value, records|
puts "The records with ids #{records.map(&:id).to_sentence} have field set to #{value}"
end
It seems a shame this has to be done with two queries but this answer confirms this approach.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With