Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does Ruby uniq preserve ordering?

Tags:

ruby

The documentation doesn't say anything about that (http://www.ruby-doc.org/core-2.2.0/Array.html#method-i-uniq).

Also, is it using a naive O(n^2) search or something else like a hashmap ? In the latter case, should I understand that my elements must have a proper implementation of hash and eql? when I want to unicize them ?

like image 368
Frédéric Terrazzoni Avatar asked Jan 23 '15 16:01

Frédéric Terrazzoni


People also ask

Does Ruby Uniq preserve order?

Hence the order is kept. Hash insertion order is preserved since ruby 1.9 and upward.

How does Ruby uniq work?

When you call uniq , it works by making a hash out of your array elements. Every element becomes a key in the hash. Because hash keys are unique, we can get a list of all the keys in the hash, this list then becomes our new array with unique elements.

Is set ordered in Ruby?

keys . Because the Ruby 1.9 Set library happens to be built upon a Hash currently, you can currently use Set as an ordered set.


1 Answers

Given the code (in C) for the Array#uniq

rb_ary_uniq(VALUE ary)
{
    VALUE hash, uniq, v;
    long i;

    if (RARRAY_LEN(ary) <= 1)
        return rb_ary_dup(ary);
    if (rb_block_given_p()) {
        hash = ary_make_hash_by(ary);
        uniq = ary_new(rb_obj_class(ary), RHASH_SIZE(hash));
        st_foreach(RHASH_TBL(hash), push_value, uniq);
    }
    else {
        hash = ary_make_hash(ary);
        uniq = ary_new(rb_obj_class(ary), RHASH_SIZE(hash));
        for (i=0; i<RARRAY_LEN(ary); i++) {
            st_data_t vv = (st_data_t)(v = rb_ary_elt(ary, i));
            if (st_delete(RHASH_TBL(hash), &vv, 0)) {
                rb_ary_push(uniq, v);
            }
        }
    }
    ary_recycle_hash(hash);

    return uniq;
}

In the general case (the else block), it creates a hash from the array (that unifies the key without keeping the order). Then it create a new empty array with the right size. Finally it go through the first Array and when it finds the key in the hash it delete that key and push it to the empty array.

Hence the order is kept.

I'd say the complexity is O(complexity(ary_make_hash) + N) in time, which is probably O(N)

like image 150
astreal Avatar answered Oct 12 '22 23:10

astreal