How can I query mongodb using mongoid/rails without timing out?

I have a rake task that processes a set of records and saves it in another collection:

batch = []

Record.where(:type => 'a').each do |r|
  batch << make_score(r)

  if batch.size % 100 == 0
    Score.collection.insert(batch)
    batch = []
  end
end

# insert whatever is left over after the last full batch of 100
Score.collection.insert(batch) unless batch.empty?

I'm processing about 100K records at a time. Unfortunately, after about 20 minutes I get a "Query response returned CURSOR_NOT_FOUND" error.

The MongoDB FAQ says to use skip and limit or to turn off timeouts; using skip and limit, the whole thing was about 2-3 times slower.
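
For reference, the skip/limit approach looks roughly like this (a sketch; the batch size is arbitrary):

batch_size = 1000
offset = 0

loop do
  # each pass opens a fresh, short-lived cursor, but the repeated skips add up
  page = Record.where(:type => 'a').skip(offset).limit(batch_size).to_a
  break if page.empty?

  Score.collection.insert(page.map { |r| make_score(r) })
  offset += batch_size
end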

How can I turn off timeouts in conjunction with mongoid?

asked Oct 25 '10 by tommy chheng


4 Answers

The MongoDB docs say you can pass in a timeout boolean; if timeout is false, the cursor will never time out:

collection.find({"type" => "a"}, {:timeout=>false})

In your case:

Record.collection.find({:type=>'a'}, :timeout => false).each ...

I also recommend you look into map-reduce with Mongo. It seems tailor-made for this sort of collection manipulation: http://www.mongodb.org/display/DOCS/MapReduce
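
For what it's worth, a map-reduce through the Ruby driver might look roughly like this (a sketch only; the scoring logic, the "value" field, and the "scores" output collection are placeholders, not taken from the question):

# Compute a score per record on the server and write the results to a
# separate "scores" output collection.
map = <<-JS
  function() {
    emit(this._id, { score: this.value });
  }
JS

reduce = <<-JS
  function(key, values) {
    // keys are unique here, so reduce just passes the single value through
    return values[0];
  }
JS

Record.collection.map_reduce(map, reduce,
  :query => { "type" => "a" },
  :out   => "scores")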

answered by Jesse Wolgamott


In Mongoid 3 you can use this:

ModelName.all.no_timeout.each do |m|
  # do something with the model
end

Which is pretty handy.
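
Applied to the task in the question, that might look something like this (a sketch; each_slice comes from Enumerable on the criteria, and inserting the batch via Score.collection.insert assumes make_score returns a plain hash):

Record.where(type: 'a').no_timeout.each_slice(100) do |records|
  Score.collection.insert(records.map { |r| make_score(r) })
end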

answered by Quentin


It does seem that, for now at least, you have to go the long route and query via the Mongo driver:

Mongoid.database[collection.name].find(a_query, { :timeout => false }) do |cursor|
  cursor.each do |row|
    do_stuff
  end
end
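
Applied to the original task, this might look roughly like the following (a sketch; the 'records' collection name and the reuse of make_score on raw driver hashes are assumptions based on the question, not part of this answer):

Mongoid.database['records'].find({ 'type' => 'a' }, { :timeout => false }) do |cursor|
  batch = []
  cursor.each do |row|
    batch << make_score(row)  # row is a raw BSON hash here, not a Mongoid document
    if batch.size % 100 == 0
      Score.collection.insert(batch)
      batch = []
    end
  end
  Score.collection.insert(batch) unless batch.empty?  # flush the remainder
end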

answered by Hakan Ensari


Here is the workaround I did: create an array to hold the full records, and work from that array like this:

products = []

Product.all.each do |p|
  products << p
end

products.each do |p|
  # Do your magic
end

Dumping all the records into the array will most likely finish before the timeout, unless you are working with an extremely large number of records. It will also consume a lot of memory if the records are large or numerous, so keep that in mind.
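
If memory does become an issue, one variant (a sketch, not from the original answer; the only/each_slice calls and the slice size are assumptions) is to load just the ids and then fetch the documents in slices:

# Pull only the ids first, then fetch and process 500 documents at a time.
ids = Product.only(:_id).map(&:id)

ids.each_slice(500) do |slice|
  Product.find(slice).each do |p|
    # Do your magic
  end
end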

answered by Bashar Abdullah