In my specific case, at least. Not trying to make general statements here.
I've got this web crawler that I wrote in Node.js. I'd love to use Ruby instead, so I re-wrote it in EventMachine. Since the original was in CoffeeScript, it was actually surprisingly easy, and the code is very much the same, except that in EventMachine I can actually trap and recover from exceptions (since I'm using fibers).
The problem is that tests that run in under 20 seconds on the Node.js code take up to and over 5 minutes on EventMachine. When I watch the connection count it almost looks like they are not even running in parallel (they queue up into the hundreds, then very slowly work their way down), though logging shows that the code points are hit in parallel.
I realize that without code you can't really know what exactly is going on, but I was just wondering if there is some kind of underlying difference and I should give up, or if they really should be able to run about as fast (a small slowdown is fine) and I should keep trying to figure out what the issue is.
I did the following, but it didn't really seem to have any effect:
puts "Running with ulimit: " + EM.set_descriptor_table_size(60000).to_s
EM.set_effective_user('nobody')
EM.kqueue
Oh, and I'm very sure that I don't have any blocking calls in EventMachine. I've combed through every line about 10 times looking for anything that could be blocking. All my network calls are EM::HttpRequest.
The problem is that tests that run in under 20 seconds on the Node.js code take up to and over 5 minutes on EventMachine. When I watch the connection count it almost looks like they are not even running in parallel (they queue up into the hundreds, then very slowly work their way down), though logging shows that the code points are hit in parallel.
If they're not running in parallel then it's not asynchronous. So you're blocking.
Basically you need to figure out what blocking IO call you've made in the standard Ruby library and remove that and replace it with an EventMachine non blocking IO call.
Your code may not have any blocking calls but are you using 3rd party code that is not your own or not from EM
? They may block. Even something as simple as a debug print / log can block.
All my network calls are EM::HttpRequest.
What about file IO, what about TCP ? What about anything else that can block. What about 3rd party libraries.
We really need to see some code here. Either to identify a bottle neck in your code or a blocking call.
node.js should not be more than an order of magnitude faster then EM.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With