Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pure-Ruby concurrent Hash

What's the best way to implement a Hash that can be modified across multiple threads, but with the smallest number of locks. For the purposes of this question, you can assume that the Hash will be read-heavy. It must be thread-safe in all Ruby implementations, including ones that operate in a truly simultaneous fashion, such as JRuby, and it must be written in pure-Ruby (no C or Java allowed).

Feel free to submit a naïve solution that always locks, but that isn't likely to be the best solution. Points for elegance, but a smaller likelihood of locking wins over smaller code.

like image 606
Yehuda Katz Avatar asked Jul 03 '09 22:07

Yehuda Katz


People also ask

What is concurrent hashing?

A concurrent hash table (concurrent hash map) is an implementation of hash tables allowing concurrent access by multiple threads using a hash function.

What is concurrent Ruby?

Concurrent Ruby is a concurrency toolkit that builds on a lot of interesting ideas from many functional languages and classic concurrency patterns.

Is Ruby hash thread safe?

Unfortunately, Ruby doesn't ship with any thread-safe Array or Hash implementations. The core Array and Hash classes are not thread-safe by default, nor should they be.


1 Answers

Okay, now that you specified the actually meaning of 'threadsafe', here are two potential implementations. The following code will run forever in MRI and JRuby. The lockless implementation follows an eventual consistency model where each thread uses it's own view of the hash if the master is in flux. There is a little trickery required to make sure storing all the information in the thread doesn't leak memory, but that is handled and tested ― process size does not grow running this code. Both implementations would need more work to be 'complete', meaning delete, update, etc. would need some thinking, but either of the two concepts below will meet your requirements.

It's very important for people reading this thread to realize the whole issue is exclusive to JRuby ― in MRI the built-in Hash is sufficient.

module Cash   def Cash.new(*args, &block)     env = ENV['CASH_IMPL']     impl = env ? Cash.const_get(env) : LocklessImpl     klass = defined?(JRUBY_VERSION) ? impl : ::Hash     klass.new(*args)   end    class LocklessImpl     def initialize       @hash = {}     end      def thread_hash       thread = Thread.current       thread[:cash] ||= {}       hash = thread[:cash][thread_key]       if hash         hash       else         hash = thread[:cash][thread_key] = {}         ObjectSpace.define_finalizer(self){ thread[:cash].delete(thread_key) }         hash       end     end      def thread_key       [Thread.current.object_id, object_id]     end      def []=(key, val)       time = Time.now.to_f       tuple = [time, val]       @hash[key] = tuple       thread_hash[key] = tuple       val     end      def [](key)     # check the master value     #       val = @hash[key]      # someone else is either writing the key or it has never been set.  we     # need to invalidate our own copy in either case     #       if val.nil?         thread_val = thread_hash.delete(key)         return(thread_val ? thread_val.last : nil)       end      # check our own thread local value     #       thread_val = thread_hash[key]      # in this case someone else has written a value that we have never seen so     # simply return it     #       if thread_val.nil?         return(val.last)       end      # in this case there is a master *and* a thread local value, if the master     # is newer juke our own cached copy     #       if val.first > thread_val.first         thread_hash.delete(key)         return val.last       else         return thread_val.last       end     end   end    class LockingImpl < ::Hash     require 'sync'      def initialize(*args, &block)       super     ensure       extend Sync_m     end      def sync(*args, &block)       sync_synchronize(*args, &block)     end      def [](key)       sync(:SH){ super }     end      def []=(key, val)       sync(:EX){ super }     end   end end    if $0 == __FILE__   iteration = 0    loop do     n = 42     hash = Cash.new      threads =       Array.new(10) {         Thread.new do           Thread.current.abort_on_exception = true           n.times do |key|             hash[key] = key             raise "#{ key }=nil" if hash[key].nil?           end         end       }      threads.map{|thread| thread.join}      puts "THREADSAFE: #{ iteration += 1 }"   end end 
like image 177
ara t howard Avatar answered Sep 16 '22 21:09

ara t howard