I've been playing with Ruby's EventMachine for some time now, and I think I understand its basics.
However, I am not sure how to read a large file (120 MB) efficiently. My goal is to read a file line by line and write every line into a Cassandra database (the same should apply to MySQL, PostgreSQL, MongoDB, etc.; the Cassandra client supports EM explicitly). The simple snippet below blocks the reactor, right?
require 'rubygems'
require 'cassandra'
require 'thrift_client/event_machine'

EM.run do
  Fiber.new do
    rm = Cassandra.new('RankMetrics', "127.0.0.1:9160", :transport => Thrift::EventMachineTransport, :transport_wrapper => nil)
    rm.clear_keyspace!
    begin
      file = File.new("us_100000.txt", "r")
      while (line = file.gets)
        rm.insert(:Domains, "#{line.downcase}", {'domain' => "#{line}"})
      end
      file.close
    rescue => err
      puts "Exception: #{err}"
      err
    end
    EM.stop
  end.resume
end
But what's the right way to get a file read asynchronously?
There is no asynchronous file IO support in EventMachine. The best way to achieve what you're trying to do is to read a couple of lines on each tick and send them off to the database. The most important thing is not to read chunks that are too large, since that would block the reactor.
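EM.run do
  io = File.open('path/to/file')
  read_chunk = proc do
    # read at most 10 lines per tick so the reactor is never blocked for long
    lines = []
    10.times do
      line = io.gets
      break unless line
      lines << line
    end
    if lines.empty?
      # end of file: nothing left to send
      io.close
      EM.stop
    else
      pending = lines.size
      lines.each do |line|
        # send_to_db is a placeholder for your DB client's asynchronous write
        send_to_db(line) do
          # when the DB call is done
          pending -= 1
          # schedule the next chunk once the whole batch has been written
          EM.next_tick(read_chunk) if pending == 0
        end
      end
    end
  end
  EM.next_tick(read_chunk)
end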
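The batching step can be tried out on its own, without a reactor or a database. The following is only a minimal sketch in plain Ruby: read_batch and each_batch are hypothetical helper names, and the StringIO stands in for the real file. It also chomps each line, since io.gets keeps the trailing newline and that newline would otherwise end up in the row key.

```ruby
require 'stringio'

# Hypothetical helper: reads up to batch_size lines from io.
# Returns an empty array once the end of the file is reached.
def read_batch(io, batch_size = 10)
  batch = []
  batch_size.times do
    line = io.gets
    break if line.nil?
    batch << line.chomp # drop the trailing newline kept by gets
  end
  batch
end

# Hypothetical driver: yields one batch at a time until the input is exhausted.
# In an EventMachine app, each iteration would be one scheduled tick instead.
def each_batch(io, batch_size = 10)
  loop do
    batch = read_batch(io, batch_size)
    break if batch.empty?
    yield batch
  end
end

# StringIO stands in for the real file here.
io = StringIO.new("Example.com\nRuby-Lang.org\nGitHub.com\n")
batches = []
each_batch(io, 2) { |batch| batches << batch }
p batches
```

A small batch size keeps each tick short; the right number depends on how long the lines are and how fast the database accepts writes.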
See "What is the best way to read files in an EventMachine-based app?"
If you haven't already, you might take a look at EM::FileStreamer. For one thing, FileStreamer uses a C++-based 'fast file reader'. Couldn't you stream the file over a local socket or pipe and handle the sending to the DB in a separate process that's listening on the other end?
There is also a non-Fiber-based example of handling synchronous DB connections gracefully in ThreadedResource, in case that's helpful; it specifically mentions Cassandra. It sounds like your Cassandra library is Fiber-based, though.
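To make the separate-process idea a bit more concrete, here is a rough sketch in plain Ruby that uses an anonymous pipe and fork. It does not use EventMachine at all, so it only shows the shape of the hand-off: store_line is a hypothetical stand-in for the real database write, stream_lines_to_worker is a made-up name, and fork assumes a Unix-like platform. In a real EM app the copy into the pipe would be done by the reactor (for example via FileStreamer) rather than by a blocking call.

```ruby
require 'tempfile'

# Hypothetical stand-in for the real (blocking) database write.
def store_line(line)
  line.downcase
end

# Streams the file at path through a pipe to a forked worker process.
# The worker handles every line and reports how many it processed.
def stream_lines_to_worker(path)
  reader, writer = IO.pipe           # file data: parent -> worker
  result_r, result_w = IO.pipe       # line count: worker -> parent

  pid = fork do
    writer.close
    result_r.close
    count = 0
    reader.each_line do |line|
      store_line(line.chomp)
      count += 1
    end
    result_w.puts(count)
    result_w.close
    exit!(0)                         # skip the parent's at_exit handlers
  end

  reader.close
  result_w.close
  File.open(path, 'r') { |file| IO.copy_stream(file, writer) }
  writer.close                       # signals end of input to the worker
  count = result_r.read.to_i
  result_r.close
  Process.wait(pid)
  count
end

# Demo with a temporary file instead of the real 120 MB input.
Tempfile.create('domains') do |f|
  f.write("Example.com\nRuby-Lang.org\nGitHub.com\n")
  f.flush
  puts stream_lines_to_worker(f.path)
end
```

The benefit of this split is that slow or blocking database calls live in the worker process, so they cannot stall the reactor that is feeding it.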