I know that Mongoid v3+ supports batch inserts via Model.collection.insert()
. However, I don't think it supports batch updates where the attributes differ for each record (so I don't think update_all
would work either). Is there a way to do batch update/upsert instead of a single-record lookup & update?
Here's a simplified example in which I have 2 models:
class Product
...
has_and_belongs_to_many :lists
end
class List
...
has_and_belongs_to_many :products
end
When a new Product
is created, I associate it with one or more Lists
. However, I also need to update the Product
attributes daily without losing the List
reference info (I'm ok that validation on Product
doesn't run).
One way, without using batch, is to call find_or_initialize_by
on Product
and update the attributes. But doing so for 10K-1M+ records is very time consuming.
Another way, which uses batch insert, is to do Product.delete_all
and then Product.collection.insert(...)
, but that creates new product_ids
and the relationship with List
is no longer maintained.
Is there a way to do batch update or upsert in this example?
MongoDB 2.6 supports the update
command, see http://docs.mongodb.org/manual/reference/command/update/
Mongoid.default_session
gives you corresponding access to the command
method at the MongoDB database level.
The following example shows how to construct and issue a bulk update command that should answer your question.
The output shows the 10 lists,
comparison of individual versus bulk-update run times,
a dump of the Product collection showing updated stock counts,
and the structure of the bulk update command.
Even in this simple demo, there is significant performance improvement with bulk updates versus individual updates.
The Ruby 'mongo' 1.x driver supports a fluent API for bulk writes, this will also be supported by the new Ruby 'mongo' 2.0 driver that will eventually find its way into Mongoid. Mongoid v3+ currently uses the Moped driver which does not have a fluent API for bulk write. But as demonstrated here, it is fairly straightforward to construct and directly issue a bulk write (bulk update) command in Mongoid/Moped via the Session#command method.
Hope that this helps.
app/models/product.rb
class Product
include Mongoid::Document
field :name, type: String
field :stock, type: Integer, default: 0
has_and_belongs_to_many :lists, index: true
end
app/models/list.rb
class List
include Mongoid::Document
field :name, type: String
has_and_belongs_to_many :products
end
test/unit/product_test.rb
require 'test_helper'
require 'benchmark'
require 'pp'
class ProductTest < ActiveSupport::TestCase
def setup
@session = Mongoid.default_session
@session.drop
end
test '0. mongoid version' do
puts "\nMongoid::VERSION:#{Mongoid::VERSION}\nMoped::VERSION:#{Moped::VERSION}"
end
def individual_updates(list_ids, repetitions)
repetitions.times do
list_ids.each_with_index do |list_id, index|
Product.where(:list_ids => list_id).update_all({'$inc' => {'stock' => index}})
end
end
end
def bulk_command(list_ids, repetitions)
updates = []
repetitions.times do
list_ids.each_with_index do |list_id, index|
updates << {'q' => {'list_ids' => list_id}, 'u' => {'$inc' => {'stock' => index}}, 'multi' => true}
end
end
{ update: Product.collection_name.to_s, updates: updates, ordered: false }
end
def bulk_updates(list_ids, repetitions)
@session.command(bulk_command(list_ids, repetitions))
end
test 'x' do
puts
[
['ASUS MeMO Pad FHD 10', ['ASUS', 'tablet', 'Android']],
['Apple iPad Air Wi-Fi + Cellular 128GB - Silver', ['Apple', 'tablet', 'iOS']],
['Apple iPad mini with Retina display Wi-Fi + Cellular 128GB - Silver', ['Apple', 'tablet', 'iOS']],
['Apple iPhone 5c 32GB Green', ['Apple', 'phone', 'iOS']],
['Apple iPhone 5s 64GB Space Gray', ['Apple', 'phone', 'iOS']],
['LG G Pad 8.3 Tablet', ['LG', 'tablet', 'Android']],
['LG Google Nexus 5 White', ['LG', 'phone', 'Android']],
['Microsoft Surface 7ZR-00001', ['Microsoft', 'tablet', 'Windows 8 RT']],
['Samsung Galaxy S4 I9500', ['Samsung', 'phone', 'Android']],
['Samsung Galaxy Tab S 8.4', ['Samsung', 'tablet', 'Android']]
] .each do |product_name, list_names|
product = Product.create(name: product_name)
list_names.each do |list_name|
list = List.where(name: list_name).first_or_create
list.products << product
end
end
list_names = List.all.to_a.collect(&:name).sort.uniq
p list_names
list_ids = list_names.collect{|list_name| List.where(name: list_name).first.id}
assert(list_ids.count > 0)
Benchmark.bm(20) do |x|
x.report('individual updates') { individual_updates(list_ids, 100) }
x.report('bulk updates') { bulk_updates(list_ids, 100) }
end
pp Product.all.to_a
db_command = bulk_command(list_ids, 1)
assert(db_command[:updates].size > 0)
pp db_command
end
end
rake test
Run options:
# Running tests:
[1/2] ProductTest#test_0._mongoid_version
Mongoid::VERSION:3.1.6
Moped::VERSION:1.5.2
[2/2] ProductTest#test_x
["ASUS", "Android", "Apple", "LG", "Microsoft", "Samsung", "Windows 8 RT", "iOS", "phone", "tablet"]
user system total real
individual updates 0.420000 0.070000 0.490000 ( 0.489579)
bulk updates 0.060000 0.000000 0.060000 ( 0.180926)
[#<Product _id: 5408b72b7f11bad1ca000001, name: "ASUS MeMO Pad FHD 10", stock: 2000, list_ids: ["5408b72c7f11bad1ca000002", "5408b72c7f11bad1ca000003", "5408b72c7f11bad1ca000004"]>,
#<Product _id: 5408b72c7f11bad1ca000005, name: "Apple iPad Air Wi-Fi + Cellular 128GB - Silver", stock: 3600, list_ids: ["5408b72c7f11bad1ca000006", "5408b72c7f11bad1ca000003", "5408b72c7f11bad1ca000007"]>,
#<Product _id: 5408b72c7f11bad1ca000008, name: "Apple iPad mini with Retina display Wi-Fi + Cellular 128GB - Silver", stock: 3600, list_ids: ["5408b72c7f11bad1ca000006", "5408b72c7f11bad1ca000003", "5408b72c7f11bad1ca000007"]>,
#<Product _id: 5408b72c7f11bad1ca000009, name: "Apple iPhone 5c 32GB Green", stock: 3400, list_ids: ["5408b72c7f11bad1ca000006", "5408b72c7f11bad1ca00000a", "5408b72c7f11bad1ca000007"]>,
#<Product _id: 5408b72c7f11bad1ca00000b, name: "Apple iPhone 5s 64GB Space Gray", stock: 3400, list_ids: ["5408b72c7f11bad1ca000006", "5408b72c7f11bad1ca00000a", "5408b72c7f11bad1ca000007"]>,
#<Product _id: 5408b72c7f11bad1ca00000c, name: "LG G Pad 8.3 Tablet", stock: 2600, list_ids: ["5408b72c7f11bad1ca00000d", "5408b72c7f11bad1ca000003", "5408b72c7f11bad1ca000004"]>,
#<Product _id: 5408b72c7f11bad1ca00000e, name: "LG Google Nexus 5 White", stock: 2400, list_ids: ["5408b72c7f11bad1ca00000d", "5408b72c7f11bad1ca00000a", "5408b72c7f11bad1ca000004"]>,
#<Product _id: 5408b72c7f11bad1ca00000f, name: "Microsoft Surface 7ZR-00001", stock: 3800, list_ids: ["5408b72c7f11bad1ca000010", "5408b72c7f11bad1ca000003", "5408b72c7f11bad1ca000011"]>,
#<Product _id: 5408b72c7f11bad1ca000012, name: "Samsung Galaxy S4 I9500", stock: 2800, list_ids: ["5408b72c7f11bad1ca000013", "5408b72c7f11bad1ca00000a", "5408b72c7f11bad1ca000004"]>,
#<Product _id: 5408b72c7f11bad1ca000014, name: "Samsung Galaxy Tab S 8.4", stock: 3000, list_ids: ["5408b72c7f11bad1ca000013", "5408b72c7f11bad1ca000003", "5408b72c7f11bad1ca000004"]>]
{:update=>"products",
:updates=>
[{"q"=>{"list_ids"=>"5408b72c7f11bad1ca000002"},
"u"=>{"$inc"=>{"stock"=>0}},
"multi"=>true},
{"q"=>{"list_ids"=>"5408b72c7f11bad1ca000004"},
"u"=>{"$inc"=>{"stock"=>1}},
"multi"=>true},
{"q"=>{"list_ids"=>"5408b72c7f11bad1ca000006"},
"u"=>{"$inc"=>{"stock"=>2}},
"multi"=>true},
{"q"=>{"list_ids"=>"5408b72c7f11bad1ca00000d"},
"u"=>{"$inc"=>{"stock"=>3}},
"multi"=>true},
{"q"=>{"list_ids"=>"5408b72c7f11bad1ca000010"},
"u"=>{"$inc"=>{"stock"=>4}},
"multi"=>true},
{"q"=>{"list_ids"=>"5408b72c7f11bad1ca000013"},
"u"=>{"$inc"=>{"stock"=>5}},
"multi"=>true},
{"q"=>{"list_ids"=>"5408b72c7f11bad1ca000011"},
"u"=>{"$inc"=>{"stock"=>6}},
"multi"=>true},
{"q"=>{"list_ids"=>"5408b72c7f11bad1ca000007"},
"u"=>{"$inc"=>{"stock"=>7}},
"multi"=>true},
{"q"=>{"list_ids"=>"5408b72c7f11bad1ca00000a"},
"u"=>{"$inc"=>{"stock"=>8}},
"multi"=>true},
{"q"=>{"list_ids"=>"5408b72c7f11bad1ca000003"},
"u"=>{"$inc"=>{"stock"=>9}},
"multi"=>true}],
:ordered=>false}
Finished tests in 1.334821s, 1.4983 tests/s, 1.4983 assertions/s.
2 tests, 2 assertions, 0 failures, 0 errors, 0 skips
Probably Gary's answer worked. But I found in mongoid 5.x and 6.x which uses the standard mongodb ruby driver, it is possible to do Model.mongo_client.bulk_write([op1, op2, ...], options)
. See
Needed both documents to make sense of constructing the call. e.g
update = TestCase.collection.bulk_write(
[
replace_one: {
filter: {},
update: {},
upsert: true
},
update_one: { ... },
...
],
ordered: false
)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With