I have an STI-based model called Buyable, with two models Basket and Item. The attributes of concern here for Buyable are:
There's a parent-child relationship between Basket and Item. parent_id is always nil for basket, but an item can belong to a basket by referencing the unique basket id. So basket has_many items, and an item belongs_to a basket.
I need a method on the basket model that:
Returns true or false depending on whether there are any other baskets in the table with both the same number and the same types of items. Items are considered to be the same type when they share the same shop_week_id and location_id.
For example:
Given a basket (uid = 7) with 2 items:
item #1
item #2
Return true if there are any other baskets in the table that contain exactly 2 items, with one item having a shop_week_id = 13 and location_id = 103 and the other having a shop_week_id = 13 and location_id = 204. Otherwise return false.
How would you approach this problem? This goes without saying, but I am looking for a very efficient solution.
The following SQL seems to do the trick:
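big_query = "
  SELECT EXISTS (
    SELECT 1
    FROM buyables b1
    JOIN buyables b2
      ON b1.shop_week_id = b2.shop_week_id
      AND b1.location_id = b2.location_id
    WHERE
      b1.parent_id != %1$d
      AND b2.parent_id = %1$d
      AND b1.type = 'Item'
      AND b2.type = 'Item'
    GROUP BY b1.parent_id
    -- every item of this basket has a match in the other basket ...
    HAVING COUNT(*) = ( SELECT COUNT(*) FROM buyables WHERE parent_id = %1$d AND type = 'Item' )
      -- ... and the other basket has no extra items
      -- (assumes a basket never holds two items of the same type)
      AND COUNT(*) = ( SELECT COUNT(*) FROM buyables b3 WHERE b3.parent_id = b1.parent_id AND b3.type = 'Item' )
  )
"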
With ActiveRecord, you can get this result using select_value:
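class Basket < Buyable
  # Assumes big_query (the SQL above) is visible here, e.g. via a constant or a helper method.
  def has_duplicate
    value = self.class.connection.select_value(big_query % id)
    # EXISTS comes back as true, 1, "1" or "t" depending on the adapter.
    # A bare !! would treat 0 and "f" as true, so compare explicitly.
    [true, 1, "1", "t"].include?(value)
  end
end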
I am not so sure about performance, however.
If you want to make this as efficient as possible, consider computing a hash that encodes the basket contents as a single string or blob and storing it in a new column. That column will need to be updated every time the basket contents change, either by the app or by a trigger. Comparing hash values then tells you which baskets are possibly equal, and you might need to perform further comparisons (as described above) in order to confirm an actual match.
What should you use for a hash though? If you know that the baskets will be limited in size, and the ids in question are bounded integers, you should be able to hash to a string that is enough in itself to test for equality. For example, you could base64 encode each shop_week and location, concatenate with a separator not in base64 (like "|"), and then concatenate with the other basket items. Build an index on the new hash key, and comparisons will be fast.
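A minimal sketch of such a hash key in plain Ruby, under a few assumptions: ItemRow is a hypothetical stand-in for an Item record, encode_id and basket_signature are made-up helper names, and ":" is an extra separator (also outside the base64 alphabet) used between the two ids of one item. The items are sorted before joining so that the order in which they were added to the basket does not affect the result.

```ruby
# Hypothetical stand-in for an Item row; only the two fields that define an item's type.
ItemRow = Struct.new(:shop_week_id, :location_id)

# Base64-encode the decimal form of an id.
# Array#pack with "m0" produces Base64 without trailing newlines.
def encode_id(id)
  [id.to_s].pack("m0")
end

# Build an order-independent key for a basket's items:
# one "shop_week:location" token per item, sorted, joined with "|".
def basket_signature(items)
  items
    .map { |item| "#{encode_id(item.shop_week_id)}:#{encode_id(item.location_id)}" }
    .sort
    .join("|")
end

a = [ItemRow.new(13, 103), ItemRow.new(13, 204)]
b = [ItemRow.new(13, 204), ItemRow.new(13, 103)]
c = [ItemRow.new(13, 103)]

puts basket_signature(a) == basket_signature(b) # true: same items, different order
puts basket_signature(a) == basket_signature(c) # false: different number of items
```

If the key is stored in an indexed column on the basket row (the column name would be your choice), the duplicate check reduces to looking for another basket with the same value in that column.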