
How to write (large) files with Ruby Eventmachine

I've spent several days now looking for non-echo-server examples for EventMachine, but there just don't seem to be any. Let's say I want to write a server that accepts a file and writes it to a Tempfile:

require 'rubygems'
require 'tempfile'
require 'eventmachine'

module ExampleServer

  def receive_data(data)
    f = Tempfile.new('random')
    f.write(data)
  ensure
    f.close if f  # f is nil if Tempfile.new itself raised
  end

end

EventMachine::run {
  EventMachine::start_server "127.0.0.1", 8081, ExampleServer
  puts 'running example server on 8081'
}

Writing to the file would block the reactor, but I don't get how to do it 'EventMachine style'. Would I have to read the data in chunks and write each chunk to disk within an EM.next_tick block?

Thanks for any help, Andreas

Andreas asked Jan 10 '11 10:01


3 Answers

Two answers:

Lazy answer: just use a blocking write. EM is already handing you discrete chunks of data, not one gigantic string. So your example implementation may be a bit off. Are you sure you want to make a new tempfile for every single chunk that EM hands you? However, I'll continue on the assumption that your sample code is working as intended.

Admittedly, the lazy approach depends on the device you're writing to, but trying to write several large streams to disk at the same time is going to be a major bottleneck, and you'll lose the advantages of having an event-based server anyway. You'll just end up juggling disk seeks all over the place, IO performance will plummet, and so will your server's performance. Handling many things at once is okay with RAM, but once you start dealing with block devices and IO scheduling, you're going to run into performance bottlenecks no matter what you're doing.
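If the intent was one file per upload rather than one file per chunk, the lazy approach might look something like the sketch below. It opens the Tempfile once per connection and appends each chunk with a plain blocking write. post_init, receive_data and unbind are EventMachine's standard connection callbacks; the module name UploadToTempfile and the upload_path helper are made up for illustration.

```ruby
require 'tempfile'

# Hypothetical handler name; the three hooks below are EventMachine's
# standard connection callbacks.
module UploadToTempfile
  # Path of this connection's file (helper added for illustration only).
  def upload_path
    @file.path
  end

  # Called once when the connection is set up: open one file per connection.
  def post_init
    @file = Tempfile.new('upload')
    @file.binmode # uploads are arbitrary bytes, not text
  end

  # Called once per chunk, on the reactor thread.
  def receive_data(data)
    @file.write(data) # plain blocking write, as in the lazy answer
  end

  # Called once when the connection closes.
  def unbind
    @file.close
  end
end
```

You would pass UploadToTempfile to EventMachine::start_server in place of ExampleServer. Keep in mind that a Tempfile is removed once the object is garbage collected, so move or copy the file in unbind if it needs to outlive the connection.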

However, I guess you might want to do some long writes to disk at the same time that you want low latency responses to other, non-IO heavy requests. So, perhaps the good answer:

Use defer.

require 'rubygems'
require 'tempfile'
require 'eventmachine'

module ExampleServer

  def receive_data(data)
    # Runs on a thread from EM's pool, so the blocking write stays off the reactor.
    operation = proc do
      f = Tempfile.new('random')
      begin
        f.write(data)
      ensure
        f.close
      end
    end

    callback = proc do
      puts "I wrote a file!"
    end

    EM.defer(operation, callback)
  end

end

EventMachine::run {
  EventMachine::start_server "127.0.0.1", 8081, ExampleServer
  puts 'running example server on 8081'
}

Yes, this does use threading. It's really not that bad in this case: you don't have to worry about synchronization between threads, because EM is nice enough to handle this for you. If you need a response, use the callback, which will be executed in the main reactor thread when the worker thread completes. Also, the GIL is something of a non-issue for this case, since you're dealing with IO blocking here, and not trying to achieve CPU concurrency.

But if you did intend to write everything to the same file, you'll have to be careful with defer, since the synchronization issue will arise as your threads will likely attempt to write to the same file at the same time.
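One way around that synchronization issue, sketched here as an alternative to EM.defer rather than anything EventMachine provides, is to give each connection a single writer thread fed by a Queue. receive_data only pushes the chunk onto the queue, and the one consumer thread writes chunks in arrival order, so writes to the same file never overlap or get reordered. The module name OrderedFileWriter and its accessors are hypothetical.

```ruby
require 'tempfile'

# Hypothetical handler: one file and one writer thread per connection.
module OrderedFileWriter
  attr_reader :file, :writer

  def post_init
    @file = Tempfile.new('upload')
    @file.binmode
    @queue = Queue.new
    # A single consumer writes chunks in arrival order. Queue#pop blocks
    # this thread (not the reactor) until a chunk or the nil sentinel arrives.
    @writer = Thread.new do
      while (chunk = @queue.pop)
        @file.write(chunk)
      end
      @file.close
    end
  end

  # Runs on the reactor thread; pushing onto the queue does not touch the disk.
  def receive_data(data)
    @queue << data
  end

  # nil is the sentinel telling the writer to finish up and close the file.
  def unbind
    @queue << nil
  end
end
```

The queue here is unbounded, so if the disk is much slower than the network, chunks will pile up in memory. That is a trade-off of this sketch, not something it solves.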

Fitzsimmons answered Sep 20 '22 01:09


From the docs, it seems you just need to attach the file and call send_data (although, as you point out, that might not be valid; the other option seems to be File.write, i.e. blocking IO).

Although I thought you couldn't mix blocking/non-blocking IO with EM :(

Given the source data is a socket, I guess that will be handled by EventMachine.

Perhaps a question for the Google group...

~chris

Chris Kimpton answered Sep 20 '22 01:09


Unfortunately files don't respond well to select interfaces. If you need something more efficient than IO#write (which is unlikely), then you could use EIO.

EIO will really only lightly unblock the reactor, and provide you with a teeny bit of buffering. If specific latencies are a problem, or you have really slow disks, that might be helpful. In most other cases, it's probably just a bunch of effort for little advantage.
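To see what "files don't respond well to select" means in practice, here is a small sketch in plain Ruby, no EventMachine needed. On POSIX systems select treats a regular file as always ready for reading and writing, even though the actual read or write may still block on the disk, so a select-based reactor gets no useful signal from it. The helper name select_reports_ready? is made up for this example.

```ruby
require 'tempfile'

# Hypothetical helper: asks select, with a zero timeout, whether `io`
# is reported ready for both reading and writing right now.
def select_reports_ready?(io)
  ready = IO.select([io], [io], nil, 0)
  return false if ready.nil? # nil means the timeout expired with nothing ready

  readable, writable = ready
  readable.include?(io) && writable.include?(io)
end

# A regular file on disk, as opposed to a socket or pipe.
Tempfile.create('select-demo') do |f|
  puts select_reports_ready?(f)
end
```

A socket with nothing to read would not be reported as readable by the same call, which is exactly the information the reactor relies on.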

raggi answered Sep 22 '22 01:09