I have a Rails 5 app. I have a table filled with URL data that is pulled in from various sources:
id url
1 http://google.com
2 http://yahoo.com
3 http://msn.com
4 http://google.com
5 http://yahoo.com
6 http://askjeeves.com
How can I remove the duplicates from this table?
SQL solution without loops:
Model.where.not(id: Model.group(:url).select("min(id)")).destroy_all
OR
Model.where.not(id: Model.group(:url).select("min(id)")).delete_all
OR
# ids to keep: the lowest id for each url
keep_ids = Model.group(:url).minimum(:id).values
Model.where.not(id: keep_ids).delete_all
# Model.where.not(id: keep_ids).destroy_all  # use this instead if callbacks must run
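Each variant removes the duplicates and keeps, for every url, the record with the lowest id. destroy_all loads each record and runs its callbacks; delete_all skips callbacks and deletes directly in SQL. On MySQL the single-query forms may be rejected because the subquery reads from the table being deleted from; the third variant, which fetches the ids first, avoids that.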
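To check which rows survive, here is a minimal plain-Ruby sketch that mimics the min(id)-per-url logic on the sample table from the question. It runs without Rails or a database; `sample_rows`, `ids_to_keep` and `ids_to_delete` are illustrative names, not ActiveRecord methods.

```ruby
# Rows from the question as plain hashes (assumption: no database involved).
def sample_rows
  [
    { id: 1, url: "http://google.com" },
    { id: 2, url: "http://yahoo.com" },
    { id: 3, url: "http://msn.com" },
    { id: 4, url: "http://google.com" },
    { id: 5, url: "http://yahoo.com" },
    { id: 6, url: "http://askjeeves.com" }
  ]
end

# Mimics Model.group(:url).select("min(id)"): the smallest id for each url.
def ids_to_keep(rows)
  rows.group_by { |row| row[:url] }
      .map { |_url, group| group.map { |row| row[:id] }.min }
end

# Mimics Model.where.not(id: ...): every id that is not being kept.
def ids_to_delete(rows)
  rows.map { |row| row[:id] } - ids_to_keep(rows)
end

puts "keep:   #{ids_to_keep(sample_rows).inspect}"
puts "delete: #{ids_to_delete(sample_rows).inspect}"
```

For the sample data this keeps ids 1, 2, 3 and 6 and marks 4 and 5 for deletion.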
You can also group the records by url in Ruby, keep one record per group, and destroy the rest:
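# group_by runs in Ruby and returns { url => [records] }
Model.all.group_by(&:url).values.each do |group|
  group.pop              # leave one (the last record in the group)
  group.each(&:destroy)  # destroy the others
end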
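Note that pop removes the last element of each group, so this loop keeps a different record than the min(id) queries do. A small plain-Ruby sketch, assuming the rows arrive in id order (ActiveRecord does not guarantee that without an explicit order(:id)); `sample_rows` and `ids_destroyed_by_pop` are hypothetical helper names:

```ruby
# Rows from the question as [id, url] pairs, assumed to be in id order.
def sample_rows
  [
    [1, "http://google.com"],
    [2, "http://yahoo.com"],
    [3, "http://msn.com"],
    [4, "http://google.com"],
    [5, "http://yahoo.com"],
    [6, "http://askjeeves.com"]
  ]
end

# For each url, set aside the last row of its group (the one that is kept)
# and collect the ids of the remaining rows (the ones that would be destroyed).
def ids_destroyed_by_pop(rows)
  rows.group_by { |_id, url| url }.values.flat_map do |group|
    rest = group.dup
    rest.pop                    # leave one: the last row in the group
    rest.map { |id, _url| id }  # everything else would be destroyed
  end
end

puts ids_destroyed_by_pop(sample_rows).inspect
```

With the sample data this destroys ids 1 and 2 and keeps 4 and 5. Using shift instead of pop would keep the first row of each group, which matches the min(id) behaviour when the rows are ordered by id.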