Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ruby query with 2 different jointable conditions

I want find all users who have particular LanguagesUsers records. Currently I'm trying to to query a join table, LanguagesUsers for all the instances where a user has 2 of the appropriate languages_id and level

To find where there is just one association, this works:

User.joins(:languages_users).where(languages_users: {language_id: 2, level: 5})

How can I include 2 languages_users in the query, where they are BOTH true? (For example, I want language_id: 2 level: 1 AND language_id: 3 level: 5)

I have tried this, but it returns an error:

User.joins(:languages_users).where(languages_users: {language_id: 2, level: 5} AND languages_users: {language_id: 1, level: 3})

This one returns an empty array, even though there are definitely users who have 2 languages_users that match this criteria:

User.joins(:languages_users).where(languages_users: {language_id: 2, level: 5}).where(languages_users: {language_id: 1, level: 3})

What is the correct format for this? Thanks!!

UPDATE: LanugagesUser is a join table between Languages and Users and has language_id, level, user_id

gem 'sqlite3', group: :development
group :production do
    gem 'pg'
    gem 'rails_12factor'
end

UPDATE 2: Since both answers have suggested an or statement, I do not think I'm correctly communicating this fact:

BOTH RECORDS MUCH BE PRESENT. For example. I want to know what users have a LanguagesUser(language_id: 1, level: 5) AND LanguagesUser(language_id:2, level:1)

like image 408
gwalshington Avatar asked Jan 06 '17 23:01

gwalshington


4 Answers

Did you try:

User.joins(:languages_users).where(languages_users: [{language_id: 2, level: 5}, {language_id: 1, level: 3}])

EDIT: Since you are using ActiveRecord >= 5, you can use the .or syntax. The result is a little verbose, but it should work, and I'm sure you can pretty it up:

User.joins(:languages_users).where(
  languages_users: {language_id: 2, level: 5}).
  or(
    User.joins(:languages_users).where(
      languages_users: {language_id: 1, level: 3}
  )

EDIT 2: I think I understand what you're going for now, more complex than I first thought, but I think you are going to need a nested query. If you first query the languages_users table and group them by user_id, you can apply a HAVING clause on that to find those users that have both languages. Then you select the user_ids from that query and use them in a simple query on the users table. Since you are using postgres, the following may work.

languages = [{ language_id: 2, level: 5 }, { language_id: 1, level: 3 }]
User.where(users_with_all_languages(languages)

def users_with_all_languages(languages)
  LanguagesUser.
    select(:user_id).
    group(:user_id).
    having('array_agg(language_id) @> ARRAY[?] AND array_agg(level) @> ARRAY[?]',
      languages.map { |x| x[:language_id] },
      languages.map { |x| x[:level] }
    )
end

Or, more simply if you have flexibility with the format of the language data:

language_ids = [2, 1]
language_levels = [5, 3]
User.where(users_with_all_languages(language_ids, language_levels)

def users_with_all_languages(ids, levels)
  LanguagesUser.
    select(:user_id).
    group(:user_id).
    having('array_agg(language_id) @> ARRAY[?] AND array_agg(level) @> ARRAY[?]', ids, levels)
end

I have done something similar, but not with two conditions on the join table, so ANDing the two array_agg conditions is the only wildcard I think...

like image 148
soupdog Avatar answered Nov 16 '22 00:11

soupdog


2 solutions (note: rails does not have support for OR queries)

1. Simple SELECT WHERE

Use a long line of code like this:

User.
  joins(:languages_users).
  where('(`languages_users`.`language_id` = ? AND `languages_users`.`level` = ?) ' + 
        'OR (`languages_users`.`language_id` = ? AND `languages_users`.`level` = ?)', 
        2, 1, 5, 3)

2. 2 liner WHERE SELECT

First you prepare a query for selecting a list of languages_users.ids

languages_users_ids = LanguagesUser.
  select(:id).
  where('(language_id = ? AND level = ?) OR (language_id = ? AND level = ?)', 
        2, 1, 5, 3)

Then you use it within a query to your users table:

User.joins(:languages_users).where(languages_users: {id: languages_users_ids})

This will produce a single "WHERE SELECT" request to your DB server:

SELECT * FROM users JOIN languages_users 
  WHERE languages_users.id IN (
    SELECT id FROM languages_users  
    WHERE (language_id = 2 AND level = 1) OR (language_id = 5 AND level = 3)
    )
like image 32
Benj Avatar answered Nov 16 '22 01:11

Benj


Let's go step by step The first condition is to have language_id: 2, level: 5 So we can have a query just for this condition

users_first_condition = LanguagesUser.where(language_id: 2, level: 5).pluck(:user_id)

Then we can do the same for the second condition {language_id: 1, level: 3}

users_second_condition = LanguagesUser.where(language_id: 1, level: 3).pluck(:user_id)

Since we have the user_ids for both conditions we can intersect both arrays and run a new query:

User.where(id: users_first_condition & users_second_condition)

If we want to prevent to run multiple queries, we can use sub-queries, so, instead of get only id's from each conditions, we can pass directly the relation to the Users' query:

users_first_condtion = Languagesuser.select(:user_id).where(language_id: 2, level: 5)
users_second_condition = Languageuser.select(:user_id).where(language_id: 1, level: 3)
User.where(id: users_first_condition).where(id: users_second_condition)
like image 37
Aguardientico Avatar answered Nov 16 '22 02:11

Aguardientico


try this

User.joins('LEFT OUTER JOIN languages_users lu1 ON lu1.user_id = users.id AND lu1.language_id = 2 AND lu1.level = 5')
    .joins('LEFT OUTER JOIN languages_users lu2 ON lu2.user_id = users.id AND lu2.language_id = 1 AND lu2.level = 3')
    .group('users.id')

I don't have postgres setup so I quickly tested this in my mysql setup.

When coming up with a more complex query, it's good to start thinking in raw sql (at least for me), then see if you can convert that into a AR query.

What you are trying to accomplish is like this is raw sql.

SELECT *
FROM users
LEFT OUTER JOIN
  languages_users lu1 ON users.id = lu1.user_id AND lu1.language_id = 2 AND lu1.level = 5
LEFT OUTER JOIN
  languages_users lu2 ON users.id = lu2.user_id AND lu2.language_id = 1 AND lu1.level = 3
GROUP BY
  users.id

This way, you will construct a table like this:

users.id | users.* | lu1.user_id | lu1.language_id | lu1.level | lu2.user_id | lu2.language_id | lu2.level
   1        ...            1             2                5            1            1                3

The two where clauses with one join wouldn't work because you are setting up two conditions to be met by languages_users table in one row. But it will always only satisfies one or less of them.

Edit

So this goes further in terms of performance. I assume you have proper index on languages_users table (you will need to index the user_id, and combination of language_id and level.

But I've tested on a similar situation with 90m rows in the equivalent of your language_users, and 13m rows in the equivalent of your users table. Either the query in my answer and the subquery answer would take really really long.

So using the subquery, you will put a lot of loads in your ruby server because you will fire individual queries to fetch all the user_ids from two subqueries and use them to construct the final query to fetch user record.

I am not able to test the performance of my join query either because it wouldn't return me before timing out.

like image 1
Edmund Lee Avatar answered Nov 16 '22 02:11

Edmund Lee