My Ruby query is really slow. I'm looking for a faster way to select items that belong to two arrays
listA
is a list of objectA with the attribute name
listB
is a list of objectB with the attribute name
There can be multiple items in listA
that all have the same name
, but nothing in listB
with the same name
. There are also items in listA
whose name
s are not in listB
. I want a function (loop) to return all the items in listA
where their name
is in listB
.
This is my current solution, but it's too slow. How can I make it faster ?
z = 0
new = []
while z < listB.length
hold = listA.select{|obj| obj.name == listB[z].name}
new += hold
z += 1
end
You want to maximise lookup speed. The best speed you can reach for lookup is O(1) using Hashes. My method described below is 1000x faster than your code and 350x faster than a loop with include?
for n=10000
EDIT 2 : This link explains more in detail why hash lookup is fast.
names = {}
listB.each{|item| names[item.name] = true}
result = listA.select{|item| names[item.name]}
EDIT : Updates to the benchmark.
Benchmark Code
class MyObject
def initialize(name)
@name = name
end
def name
@name
end
end
n=10000
k=n/10
listA = (1..n).map{ MyObject.new(('a'..'z').to_a.shuffle[0,8].join)}
listB = listA.sample(k)
listB += (1..(n-k)).map{ MyObject.new(('a'..'z').to_a.shuffle[0,8].join)}
require 'benchmark'
require 'set'
Benchmark.bm do |x|
x.report("Hash") do
names = {}
listB.each{|item| names[item.name] = true}
result = listA.select{|item| names[item.name]}
end
x.report("OP's code") do
z = 0
new = []
while z < listB.length
hold = listA.select{|obj| obj.name == listB[z].name}
new += hold
z += 1
end
end
x.report("Include") do
names = listB.map(&:name)
listA.select{|obj| names.include?(obj.name) }
end
x.report("Set") do
names = listB.map(&:name).to_set
listA.select{|item| names.include?(item.name)}
end
end
Benchmark
Specs :
Results :
Algorithm user system total real
Hash 0.015000 0.000000 0.015000 ( 0.013002)
OP's code 26.053000 0.016000 26.069000 ( 26.161283)
Include 9.219000 0.000000 9.219000 ( 9.244161)
Set 0.016000 0.000000 0.016000 ( 0.013001)
Something like:
names = listB.map(&:name)
listA.select {|obj| names.inlude? obj.name }
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With