Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How are Arrays Sets and SortedSets implemented under the hood in Ruby

Tags:

ruby

Usually Arrays are implemented as a chunks of memory, sets as hash maps, and sorted sets as skip lists. Is that also the case in Ruby?

I'm trying to evaluate the use of different containers in Ruby in terms of performance and memory footprint

like image 627
EugeneMi Avatar asked Nov 05 '14 17:11

EugeneMi


1 Answers

Arrays are part of the Ruby core library. Each Ruby implementation has its own implementation of arrays. The Ruby Language Specification only specifies the behavior of Ruby arrays, it does not specify any particular implementation strategy. It doesn't even specify any performance constraints that would force or at least suggest a particular implementation strategy.

However, most Rubyists have some expectation about the performance characteristics of arrays that would force an implementation that doesn't meet them into obscurity, because nobody would actually use it:

  • inserting, prepending or appending as well as deleting an element has a worst-case step-complexity of O(n) and an amortized worst-case step-complexity of O(1)
  • accessing an element has a worst-case step-complexity of O(1)
  • iterating over all elements has a worst-case step-complexity of O(n)

This means that arrays need to be implemented as dynamic arrays with exponential resizing, otherwise those performance guarantees couldn't be met. You might get away with a very wide and shallow tree, but AFAIK no Ruby implementation does that.

Here's Rubinius's array implementation, which I personally find the easiest to read of all Ruby implementations. (Note: only the bare essentials are implemented in C++, most of the array methods are implemented in Ruby, e.g. in kernel/common/array.rb).

Set and SortedSet are part of the set library in the stdlib. The stdlib is shared mostly unchanged between Ruby implementations (at least the parts that are actually written in Ruby, obviously the parts written in other languages can't be shared), and since Set is written in Ruby, you can expect it to the same on all Ruby implementations.

Set is implemented as a Hash, where only the keys are used, the values are simply always true: see Set#add in lib/set.rb.

SortedSet is backed by a Red-Black-Tree that isn't implemented in Ruby.

like image 69
Jörg W Mittag Avatar answered Nov 15 '22 06:11

Jörg W Mittag