
Rails: batched attribute queries using AREL

I'd like to use something like find_in_batches, but instead of batching fully instantiated AR objects, I would like to batch a single attribute, say the id. Basically, a mix of find_in_batches and pluck:

Cars.where(:engine => "Turbo").pluck(:id).find_in_batches do |ids|
  puts ids
end

# [1, 2, 3....]
# ...

Is there a way to do this (maybe with Arel) without having to write the OFFSET/LIMIT logic myself or resorting to pagination gems like will_paginate or kaminari?

ChuckE asked Mar 21 '13 15:03


1 Answer

This is not an ideal solution, but here is a method (untested) that copies most of find_in_batches and yields a relation instead of an array of records. Just monkey-patch it into ActiveRecord::Relation:

def in_batches( options = {} )
  relation = self

  unless arel.orders.blank? && arel.taken.blank?
    ActiveRecord::Base.logger.warn("Scoped order and limit are ignored, it's forced to be batch order and batch size")
  end

  if (finder_options = options.except(:start, :batch_size)).present?
    raise "You can't specify an order, it's forced to be #{batch_order}" if options[:order].present?
    raise "You can't specify a limit, it's forced to be the batch_size"  if options[:limit].present?

    relation = apply_finder_options(finder_options)
  end

  start = options.delete(:start)
  batch_size = options.delete(:batch_size) || 1000

  relation = relation.reorder(batch_order).limit(batch_size)
  relation = start ? relation.where(table[primary_key].gteq(start)) : relation

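  # Keep fetching batches until one comes back empty.
  while (size = relation.size) > 0

    yield relation

    # A short batch means there is nothing left to fetch.
    break if size < batch_size

    # NOTE: relation.last instantiates a record just to read its id.
    primary_key_offset = relation.last.id
    if primary_key_offset
      # Start the next batch right after the last id seen.
      relation = relation.where(table[primary_key].gt(primary_key_offset))
    else
      raise "Primary key not included in the custom select clause"
    end
  end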
end

With this, you should be able to do:

Cars.where(:engine => "Turbo").in_batches do |relation|
  relation.pluck(:id)
end

This is not the best possible implementation (especially the primary_key_offset calculation, which instantiates a record just to read its id), but you get the idea.
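
To illustrate the idea without instantiating records, here is a minimal plain-Ruby sketch of the same "id greater than the last one seen" batching, run against an in-memory array instead of a database. ROWS, fetch_ids and each_id_batch are hypothetical names made up for this illustration, not part of Rails; fetch_ids only stands in for a query that plucks ids in primary key order with a limit.

```ruby
# In-memory stand-in for a table: each hash plays the role of a row.
# (Hypothetical data, for illustration only.)
ROWS = (1..10).map { |i| { id: i, engine: i.odd? ? "Turbo" : "V6" } }.freeze

# Simulates: SELECT id FROM cars WHERE engine = ? AND id > ? ORDER BY id LIMIT ?
def fetch_ids(rows, engine:, after_id:, limit:)
  rows.select { |r| r[:engine] == engine && (after_id.nil? || r[:id] > after_id) }
      .map { |r| r[:id] }
      .sort
      .first(limit)
end

# Yields arrays of ids, batch by batch, using the last id of each
# batch as the lower bound of the next one (no OFFSET needed).
def each_id_batch(rows, engine:, batch_size: 1000)
  last_id = nil
  loop do
    ids = fetch_ids(rows, engine: engine, after_id: last_id, limit: batch_size)
    break if ids.empty?

    yield ids

    # A short batch means there is nothing left to fetch.
    break if ids.size < batch_size

    # Only the id itself is needed as the offset; no record is built.
    last_id = ids.last
  end
end

each_id_batch(ROWS, engine: "Turbo", batch_size: 2) { |ids| p ids }
# [1, 3]
# [5, 7]
# [9]
```

In a real application the body of fetch_ids would presumably be a pluck on the ordered, limited relation, so the last plucked id can replace relation.last.id. Note also that Rails 5.0 and later ship a built-in in_batches that yields relations, which may make the monkey-patch unnecessary on newer versions.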

m_x answered Sep 24 '22 00:09