Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ActiveRecord: Alternative to find_in_batches?

I have a query that loads thousands of objects and I want to tame it by using find_in_batches:

Car.includes(:member).where(:engine => "123").find_in_batches(batch_size: 500) ...

According to the docs, I can't have a custom sorting order: http://www.rubydoc.info/docs/rails/4.0.0/ActiveRecord/Batches:find_in_batches

However, I need a custom sort order of created_at DESC. Is there another method to run this query in chunks like it does in find_in_batches so that not so many objects live on the heap at once?

like image 912
bigpotato Avatar asked May 28 '15 14:05

bigpotato


2 Answers

Hm I've been thinking about a solution for this (I'm the person who asked the question). It makes sense that find_in_batches doesn't allow you to have a custom order because lets say you sort by created_at DESC and specify a batch_size of 500. The first loop goes from 1-500, the second loop goes from 501-1000, etc. What if before the 2nd loop occurs, someone inserts a new record into the table? That would be put onto the top of the query results and your results would be shifted 1 to the left and your 2nd loop would have a repeat.

You could argue though that created_at ASC would be safe then, but it's not guaranteed if your app specifies a created_at value.

UPDATE:

I wrote a gem for this problem: https://github.com/EdmundMai/batched_query

Since using it, the average memory of my application has HALVED. I highly suggest anyone having similar issues to check it out! And contribute if you want!

like image 91
bigpotato Avatar answered Nov 20 '22 06:11

bigpotato


The slower manual way to do this, is to do something like this:

count = Cars.includes(:member).where(:engine => "123").count
count = count/500
count += 1 if count%500 > 0
last_id = 0
while count > 0
    ids = Car.includes(:member).where("engine = "123" and id > ?", last_id).order(created_at: :desc).limit(500).ids #which plucks just the ids`   
    cars = Cars.find(ids)
    #cars.each or #cars.update_all
    #do your updating 
    last_id = ids.last
    count -= 1
end 
like image 27
creativereason Avatar answered Nov 20 '22 06:11

creativereason