What is the easiest and fastest way to achieve a clause where all elements in an array must be matched - not only one when using IN
? After all it should behave like mongodb's $all.
Thinking about group conversations where conversation_users is a join table between conversation_id and user_id I have something like this in mind:
WHERE (conversations_users.user_id ALL IN (1,2))
UPDATE 16.07.12
Adding more info about schema and case:
The join-table is rather simple:
Table "public.conversations_users" Column | Type | Modifiers | Storage | Description -----------------+---------+-----------+---------+------------- conversation_id | integer | | plain | user_id | integer | | plain |
A conversation has many users and a user belongs to many conversations. In order to find all users in a conversation I am using this join table.
In the end I am trying to figure out a ruby on rails scope
that find's me a conversation depending on it's participants - e.g.:
scope :between, ->(*users) { joins(:users).where('conversations_users.user_id all in (?)', users.map(&:id)) }
UPDATE 23.07.12
My question is about finding an exact match of people. Therefore:
Conversation between (1,2,3)
won't match if querying for (1,2)
Array can be passed in WHERE clause of sql query and in PLSQL as well.
To query if the array field contains at least one element with the specified value, use the filter { <field>: <value> } where <value> is the element value. To specify conditions on the elements in the array field, use query operators in the query filter document: { <array field>: { <operator1>: <value1>, ... } }
Assuming the join table follows good practice and has a unique compound key defined, i.e. a constraint to prevent duplicate rows, then something like the following simple query should do.
select conversation_id from conversations_users where user_id in (1, 2) group by conversation_id having count(*) = 2
It's important to note that the number 2 at the end is the length of the list of user_ids. That obviously needs to change if the user_id list changes length. If you can't assume your join table doesn't contain duplicates, change "count(*)" to "count(distinct user_id)" at some possible cost in performance.
This query finds all conversations that include all the specified users even if the conversation also includes additional users.
If you want only conversations with exactly the specified set of users, one approach is to use a nested subquery in the where clause as below. Note, first and last lines are the same as the original query, only the middle two lines are new.
select conversation_id from conversations_users where user_id in (1, 2) and conversation_id not in (select conversation_id from conversations_users where user_id not in (1,2)) group by conversation_id having count(*) = 2
Equivalently, you can use a set difference operator if your database supports it. Here is an example in Oracle syntax. (For Postgres or DB2, change the keyword "minus" to "except.)
select conversation_id from conversations_users where user_id in (1, 2) group by conversation_id having count(*) = 2 minus select conversation_id from conversations_users where user_id not in (1,2)
A good query optimizer should treat the last two variations identically, but check with your particular database to be sure. For example, the Oracle 11GR2 query plan sorts the two sets of conversation ids before applying the minus operator, but skips the sort step for the last query. So either query plan could be faster depending on multiple factors such as the number of rows, cores, cache, indices etc.
I'm collapsing those users into an array. I'm also using a CTE (the thing in the WITH clause) to make this more readable.
=> select * from conversations_users ; conversation_id | user_id -----------------+--------- 1 | 1 1 | 2 2 | 1 2 | 3 3 | 1 3 | 2 (6 rows) => WITH users_on_conversation AS ( SELECT conversation_id, array_agg(user_id) as users FROM conversations_users WHERE user_id in (1, 2) --filter here for performance GROUP BY conversation_id ) SELECT * FROM users_on_conversation WHERE users @> array[1, 2]; conversation_id | users -----------------+------- 1 | {1,2} 3 | {1,2} (2 rows)
EDIT (Some resources)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With