Find and keep duplicates in Ruby hashes

Tags: arrays, ruby, hash

I have an array of hashes where I need to find and store matches based on one matching value between the hashes.

a = [{:id => 1, :name => "Jim", :email => "[email protected]"}, 
     {:id => 2, :name => "Paul", :email => "[email protected]"}, 
     {:id => 3, :name => "Tom", :email => "[email protected]"}, 
     {:id => 1, :name => "Jim", :email => "[email protected]"}, 
     {:id => 5, :name => "Tom", :email => "[email protected]"}, 
     {:id => 6, :name => "Jim", :email => "[email protected]"}]

So I would want to return

b = [{:id => 1, :name => "Jim", :email => "[email protected]"},  
     {:id => 3, :name => "Tom", :email => "[email protected]"}, 
     {:id => 5, :name => "Tom", :email => "[email protected]"}, 
     {:id => 6, :name => "Jim", :email => "[email protected]"}]

Notes: I can sort the data (CSV) by :name after the fact, so the results don't have to be nicely grouped, just accurate. Also, a name won't necessarily appear exactly twice; it could appear 3, 10, or more times.

Also, the data is about 22,000 rows.

lyonsinbeta asked Aug 14 '13 02:08


1 Answer

I tested this and it will do exactly what you want:

b = a.group_by { |h| h[:name] }.values.select { |group| group.size > 1 }.flatten

However, you might want to look at some of the intermediate objects produced in that calculation and see if those are more useful to you.
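To make those intermediate objects concrete, here is a rough sketch that splits the one-liner into named steps. The variable names (`rows`, `by_name`, `dupes_by_name`, `flat_dupes`) and the email values are placeholders I made up for illustration; they are not from the question.

```ruby
# Hypothetical sample data; the emails are invented placeholders.
rows = [
  { id: 1, name: "Jim",  email: "jim1@example.com" },
  { id: 2, name: "Paul", email: "paul@example.com" },
  { id: 3, name: "Tom",  email: "tom1@example.com" },
  { id: 5, name: "Tom",  email: "tom2@example.com" },
  { id: 6, name: "Jim",  email: "jim2@example.com" }
]

# Step 1: bucket the rows by name.
# Result is a Hash: { "Jim" => [...], "Paul" => [...], "Tom" => [...] }
by_name = rows.group_by { |h| h[:name] }

# Step 2: keep only the names that occur more than once.
# Hash#select returns a Hash, so the name => rows mapping is preserved.
dupes_by_name = by_name.select { |_name, group| group.size > 1 }

# Step 3: flatten to a single array only if a flat list is really needed.
flat_dupes = dupes_by_name.values.flatten

puts dupes_by_name.keys.inspect
puts flat_dupes.map { |h| h[:id] }.inspect
```

The `dupes_by_name` hash may be the more useful shape if the rows are going to be written out grouped by name anyway. If fully identical rows should appear only once in the result, appending `.uniq` to the flat array is one option, though that is an assumption about the desired output.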

David Grayson answered Oct 12 '22 17:10