
Removing Identical Objects in Ruby?

I am currently writing a Ruby app that searches Twitter for various things. One problem I am going to face is results shared between searches that run close together in time. The results are returned as an array of objects, each of which is a single tweet. I know of the Array#uniq method in Ruby, which returns an array with all the duplicates removed.

My question is this: does the uniq method treat objects as duplicates only when they point to the same space in memory, or whenever they contain identical information?

If the former, what's the best way of removing duplicates from an array based on their content?

Patrick O'Doherty asked Oct 30 '09 15:10



2 Answers

Does the uniq method treat objects as duplicates only when they point to the same space in memory, or whenever they contain identical information?

The method relies on eql? (and, in current Ruby versions, on hash as well), so it removes every element for which a.eql?(b) returns true. The exact behavior depends on the class of the objects you are dealing with.

Strings, for example, are considered equal if they contain the same text, regardless of whether they share the same memory allocation.

a = b = "foo"
c = "foo"

[a, b, c].uniq
# => ["foo"]
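The difference between identity and content can be checked directly. A small sketch (the variable names are only illustrative): equal? tests whether two references are the same object, while eql? tests content, which is what uniq looks at.

```ruby
a = "foo"
b = a        # the same object as a
c = a.dup    # a different object with the same content

same_object   = a.equal?(b)   # => true  (identity)
copy_is_equal = a.equal?(c)   # => false (different objects)
copy_is_eql   = a.eql?(c)     # => true  (same content)

strings = [a, b, c].uniq            # => ["foo"]
nested  = [[1, 2], [1, 2]].uniq     # => [[1, 2]]

# eql? is stricter than ==: 1 == 1.0 is true, but 1.eql?(1.0) is false,
# so uniq keeps both values.
mixed = [1, 1.0, 1].uniq            # => [1, 1.0]
```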

This is true for most core objects, but not for instances of your own classes, which by default are equal only to themselves.

class Foo
end

a = Foo.new
b = Foo.new

a.eql? b
# => false

Ruby encourages you to redefine the == operator to suit your class; for uniq, eql? and hash should be kept consistent with it.

In your specific case, I would suggest creating a class that represents a Twitter result and implementing your comparison logic in it, so that Array#uniq behaves as you expect.

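class Result

  attr_accessor :text, :notes

  def initialize(text = nil, notes = nil)
    self.text = text
    self.notes = notes
  end

  # Two results are equal when they have the same class and the same text.
  def ==(other)
    other.class == self.class &&
    other.text  == self.text
  end
  alias :eql? :==

  # uniq groups elements by hash before calling eql?, so equal results
  # must also return the same hash value.
  def hash
    [self.class, text].hash
  end

end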

a = Result.new("first")
b = Result.new("first")
c = Result.new("third")

[a, b, c].uniq
# => [a, c]
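If you cannot change the class of the result objects, another option is to pass a block to uniq (supported since Ruby 1.9.2); the block's return value is then used as the comparison key. A minimal sketch, assuming each tweet exposes a unique id; the Tweet class and its fields here are hypothetical stand-ins, not the API of any Twitter library:

```ruby
# Hypothetical stand-in for a tweet object; it defines no ==, eql? or hash.
class Tweet
  attr_reader :id, :text

  def initialize(id, text)
    @id = id
    @text = text
  end
end

first_search  = [Tweet.new(1, "ruby rocks"), Tweet.new(2, "hello world")]
second_search = [Tweet.new(2, "hello world"), Tweet.new(3, "uniq is handy")]

# Keep the first tweet seen for each id; the ids are compared with hash/eql?.
unique_tweets = (first_search + second_search).uniq { |tweet| tweet.id }
unique_ids    = unique_tweets.map(&:id)   # => [1, 2, 3]
```

This keeps the comparison rule at the call site instead of inside the class, which is handy when different searches need different notions of "duplicate".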
Simone Carletti answered Sep 30 '22 00:09


For anyone else stumbling upon this question: things appear to have changed a bit since it was first asked. In newer Ruby versions (1.9.3 at least), Array#uniq assumes that your object also has a meaningful implementation of the hash method, in addition to eql? or ==.
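A small sketch of the difference, using two hypothetical classes made up for illustration: one overrides only == and eql?, the other also overrides hash.

```ruby
# Overrides == and eql? but keeps the default, identity-based hash.
class EqlOnly
  attr_reader :text

  def initialize(text)
    @text = text
  end

  def ==(other)
    other.class == self.class && other.text == text
  end
  alias eql? ==
end

# Same equality rule, plus a hash that agrees with it.
class EqlAndHash < EqlOnly
  def hash
    [self.class, text].hash
  end
end

# uniq looks at hash first, so the two "equal" objects are both kept here.
eql_only_count = [EqlOnly.new("first"), EqlOnly.new("first")].uniq.size        # => 2

# With a matching hash, the second object is recognised as a duplicate.
both_count = [EqlAndHash.new("first"), EqlAndHash.new("first")].uniq.size      # => 1
```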

Eero Helenius answered Sep 30 '22 00:09