Find rows with multiple duplicate fields with Active Record, Rails & Postgres

Tested & Working Version

User.select(:first,:email).group(:first,:email).having("count(*) > 1")

Also, this is a little unrelated but handy. If you want to see how many times each combination was found, put .size at the end:

User.select(:first,:email).group(:first,:email).having("count(*) > 1").size

and you'll get a result set back that looks like this:

{[nil, nil]=>512,
 ["Joe", "[email protected]"]=>23,
 ["Jim", "[email protected]"]=>36,
 ["John", "[email protected]"]=>21}

Thought that was pretty cool and hadn't seen it before.
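
For anyone wondering what that hash actually represents, here is a minimal plain-Ruby sketch of the same grouping-and-counting logic, run over an in-memory array instead of the database. The SAMPLE_USERS data, the example.com addresses and the duplicate_counts helper are all made up for illustration; this is not what Active Record executes, just the same idea.

```ruby
# Hypothetical stand-in for rows of the users table (not real data).
SAMPLE_USERS = [
  { id: 1, first: "Joe", email: "joe@example.com" },
  { id: 2, first: "Joe", email: "joe@example.com" },
  { id: 3, first: "Jim", email: "jim@example.com" },
  { id: 4, first: nil,   email: nil },
  { id: 5, first: nil,   email: nil },
  { id: 6, first: "Joe", email: "other@example.com" }
].freeze

# Count each (first, email) combination and keep only those seen more
# than once, mirroring GROUP BY first, email HAVING COUNT(*) > 1.
# Requires Ruby 2.7+ for Enumerable#tally.
def duplicate_counts(rows)
  rows
    .map { |row| row.values_at(:first, :email) }
    .tally
    .select { |_combination, count| count > 1 }
end

p duplicate_counts(SAMPLE_USERS)
```

Note that the [nil, nil] combination is counted like any other, which matches the result above: GROUP BY puts NULL values in the same group.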

Credit to Taryn, this is just a tweaked version of her answer.

That error occurs because PostgreSQL requires every non-aggregated column in the SELECT list to appear in the GROUP BY clause.

Try:

User.select(:first,:email).group(:first,:email).having("count(*) > 1").all

(note: not tested, you may need to tweak it)

EDITED to remove id column

If you need the full models, try the following (based on @newUserNameHere's answer).

User.where(email: User.select(:email).group(:email).having("count(*) > 1"))

This will return the rows where the email address of the row is not unique.

I'm not aware of a way to do this over multiple attributes.
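
As a fallback, the multi-attribute case can at least be sketched in plain Ruby once the rows are loaded. This is not an Active Record query and it pulls everything into memory, so treat it as an illustration for small tables only. ACCOUNTS and rows_with_duplicate_keys are hypothetical names invented for this example.

```ruby
# Hypothetical stand-in for loaded user rows (not real data).
ACCOUNTS = [
  { id: 1, first: "Joe", email: "joe@example.com" },
  { id: 2, first: "Jim", email: "jim@example.com" },
  { id: 3, first: "Joe", email: "joe@example.com" },
  { id: 4, first: "Joe", email: "jim@example.com" },
  { id: 5, first: "Jim", email: "jim@example.com" }
].freeze

# Return every full row whose combination of the given attributes
# occurs more than once in the collection.
def rows_with_duplicate_keys(rows, *attrs)
  rows
    .group_by { |row| row.values_at(*attrs) }
    .select { |_key, group| group.size > 1 }
    .values
    .flatten(1)
end

duplicates = rows_with_duplicate_keys(ACCOUNTS, :first, :email)
p duplicates.map { |row| row[:id] }
```

Row 4 is left out because it only shares one of the two attributes with each of the other rows.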

Get the extra copies (every row in a duplicate group except one) with a single query if you use PostgreSQL:

def duplicated_users
  duplicated_ids = User
    .group(:first, :email)
    .having("COUNT(*) > 1")
    .select('unnest((array_agg("id"))[2:])')

  User.where(id: duplicated_ids)
end

irb> duplicated_users
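
To make the unnest((array_agg("id"))[2:]) part less opaque, here is a rough plain-Ruby equivalent over an in-memory array: collect the ids of each (first, email) group and drop the first one. RECORDS and redundant_ids are hypothetical names for this sketch, not part of the query above.

```ruby
# Hypothetical stand-in for rows of the users table (not real data).
RECORDS = [
  { id: 10, first: "Joe", email: "joe@example.com" },
  { id: 11, first: "Joe", email: "joe@example.com" },
  { id: 12, first: "Jim", email: "jim@example.com" },
  { id: 13, first: "Joe", email: "joe@example.com" },
  { id: 14, first: "Jim", email: "jim@example.com" },
  { id: 15, first: "Ann", email: "ann@example.com" }
].freeze

# For each group of rows sharing the given attributes, keep every id
# except the first one. Groups of size one contribute nothing, which
# has the same effect as HAVING COUNT(*) > 1.
def redundant_ids(rows, *attrs)
  rows
    .group_by { |row| row.values_at(*attrs) }
    .values
    .flat_map { |group| group.map { |row| row[:id] }.drop(1) }
end

p redundant_ids(RECORDS, :first, :email)
```

One difference to keep in mind: the Ruby version keeps whichever row appears first in the array, while array_agg without an ORDER BY gives no guarantee about which id ends up first, so the row that survives in each group is effectively arbitrary.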

Tags:

postgresql

ruby-on-rails

activerecord
