Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

select all records holding some condition in has_many association - Ruby On Rails

I have a model profile.rb with following association

class User < ActiveRecord::Base
   has_one :profile
end

class Profile < ActiveRecord::Base
    has_many :skills
    belongs_to :user
end

I have a model skills.rb with following association

class Skill < ActiveRecord::Base
    belongs_to :profile
end

I have following entries in skills table

id:         name:           profile_id:
====================================================
1           accounting          1
2           martial arts        2
3           law                 1
4           accounting          2
5           journalist          3
6           administration      1

and so on , how can i query all the profiles with ,lets say, "accounting" & "administration" skills which will be profile with id 1 considering the above recode. so far i have tried following

Profile.includes(:skills).where(skills: {name: ["accounting" , "administration"]} )

but instead of finding profile with id 1 - It gets me [ 1, 2 ] because profile with id 2 holds "accounting" skills and it's performing an "IN" operation in database

Note: I'm using postgresql and question is not only about a specific id of profile as described (which i used only as an example) - The original question is to get all the profiles which contain these two mentioned skills.

My activerecord join fires the following query in postgres

SELECT FROM "profiles" LEFT OUTER JOIN "skills" ON "skills"."profile_id" = "profiles"."id" WHERE "skills"."name" IN ('Accounting', 'Administration')

In below Vijay Agrawal's answer is something which i already have in my application and both, his and mine, query use IN wildcard which result in profile ids which contain either of skills while my question is to get profile ids which contain both the skills. I'm sure that there must be a way to fix this thing in the same query way which is listed in original question and i'm curious to learn that way . I hope that i'll get some more help with you guys - thanks

For clarity, I want to query all the profiles with multiple skills in a model with has_many relationship with profile model - using the Profile as primary table not the skills

Reason for using Profile as primary table is that in pagination i don't want to get all skills from related table ,say 20_000 or more rows and then filter according to profile.state column . instead anyone would like to select only 5 records which meet the profile.state , profile.user.is_active and other columns condition and match the skills without retrieving thousands of irrelevant records and then filter them again.

like image 704
Mani Avatar asked Jul 28 '17 21:07

Mani


2 Answers

You should do this to get all profile_ids which have both accounting and administration skills :

Skill.where(name: ["accounting", "administration"]).group(:profile_id).having("count('id') = 2").pluck(:profile_id)

If you need profiles details, you can put this query in where clause of Profile for id.

Note the number 2 in query, it is length of your array used in where clause. In this case ["accounting", "administration"].length

UPDATE::

Based on updated question description, instead of pluck you can use select and add subquery to make sure it happens in one query.

Profile.where(id: Skill.where(name: ["accounting", "administration"]).group(:profile_id).having("count('id') = 2").select(:profile_id))

More over you have control over sorting, pagination and additional where clause. Don't see any concerns over there which are mentioned in question edit.

UPDATE 2::

Another way to get intersect of profiles with both the skills (likely to be less efficient than above solution):

profiles = Profile

["accounting", "administration"].each do |name|
  profiles = profiles.where(id: Skill.where(name: name).select(:profile_id))
end
like image 75
Vijay Agrawal Avatar answered Nov 07 '22 09:11

Vijay Agrawal


Profile.includes(:skills).where("skills.name" => %w(accounting administration))

For more information, read about finding through ActiveRecord associations.

Update

If this is not working for you then you likely do not have your database and models properly configured, because in a brand new Rails app this works as expected.

class CreateProfiles < ActiveRecord::Migration[5.1]
  def change
    create_table :profiles do |t|
      t.timestamps
    end
  end
end

class CreateSkills < ActiveRecord::Migration[5.1]
  def change
    create_table :skills do |t|
      t.string :name
      t.integer :profile_id
      t.timestamps
    end
  end
end

class Profile < ApplicationRecord
  has_many :skills
end

class Skill < ApplicationRecord
  belongs_to :profile
end

Profile.create
Profile.create
Skill.create(name: 'foo', profile_id: 1)
Skill.create(name: 'bar', profile_id: 1)
Skill.create(name: 'baz', profile_id: 2)

Profile.includes(:skills).where("skills.name" => %w(foo))
  SQL (0.3ms)  SELECT  DISTINCT "profiles"."id" FROM "profiles" LEFT OUTER JOIN "skills" ON "skills"."profile_id" = "profiles"."id" WHERE "skills"."name" = 'foo' LIMIT ?  [["LIMIT", 11]]
  SQL (0.1ms)  SELECT "profiles"."id" AS t0_r0, "profiles"."created_at" AS t0_r1, "profiles"."updated_at" AS t0_r2, "skills"."id" AS t1_r0, "skills"."name" AS t1_r1, "skills"."profile_id" AS t1_r2, "skills"."created_at" AS t1_r3, "skills"."updated_at" AS t1_r4 FROM "profiles" LEFT OUTER JOIN "skills" ON "skills"."profile_id" = "profiles"."id" WHERE "skills"."name" = 'foo' AND "profiles"."id" = 1
 => #<ActiveRecord::Relation [#<Profile id: 1, created_at: "2017-07-28 21:52:56", updated_at: "2017-07-28 21:52:56">]>

Profile.includes(:skills).where("skills.name" => %w(bar))
  SQL (0.3ms)  SELECT  DISTINCT "profiles"."id" FROM "profiles" LEFT OUTER JOIN "skills" ON "skills"."profile_id" = "profiles"."id" WHERE "skills"."name" = 'bar' LIMIT ?  [["LIMIT", 11]]
  SQL (0.1ms)  SELECT "profiles"."id" AS t0_r0, "profiles"."created_at" AS t0_r1, "profiles"."updated_at" AS t0_r2, "skills"."id" AS t1_r0, "skills"."name" AS t1_r1, "skills"."profile_id" AS t1_r2, "skills"."created_at" AS t1_r3, "skills"."updated_at" AS t1_r4 FROM "profiles" LEFT OUTER JOIN "skills" ON "skills"."profile_id" = "profiles"."id" WHERE "skills"."name" = 'bar' AND "profiles"."id" = 1
 => #<ActiveRecord::Relation [#<Profile id: 1, created_at: "2017-07-28 21:52:56", updated_at: "2017-07-28 21:52:56">]>

Profile.includes(:skills).where("skills.name" => %w(baz))
  SQL (0.3ms)  SELECT  DISTINCT "profiles"."id" FROM "profiles" LEFT OUTER JOIN "skills" ON "skills"."profile_id" = "profiles"."id" WHERE "skills"."name" = 'baz' LIMIT ?  [["LIMIT", 11]]
  SQL (0.1ms)  SELECT "profiles"."id" AS t0_r0, "profiles"."created_at" AS t0_r1, "profiles"."updated_at" AS t0_r2, "skills"."id" AS t1_r0, "skills"."name" AS t1_r1, "skills"."profile_id" AS t1_r2, "skills"."created_at" AS t1_r3, "skills"."updated_at" AS t1_r4 FROM "profiles" LEFT OUTER JOIN "skills" ON "skills"."profile_id" = "profiles"."id" WHERE "skills"."name" = 'baz' AND "profiles"."id" = 2
 => #<ActiveRecord::Relation [#<Profile id: 2, created_at: "2017-07-28 21:53:34", updated_at: "2017-07-28 21:53:34">]>

Update 2

Downvoting an answer because you changed your question later is poor form.

You should change your model relationships from has_many and belongs_to to has_and_belongs_to_many. This will allow you to stop recording a new skill every time; if someone adds the skill administration and then later on someone else adds that skill, you don't have to create a new skill. You just re-use the existing skill and associate it with multiple profiles:

class Profile < ApplicationRecord
  has_and_belongs_to_many :skills
end

class Skill < ApplicationRecord
  has_and_belongs_to_many :profiles
end

Add a join table with a unique index (so each profile can have each skill once and only once):

class Join < ActiveRecord::Migration[5.1]
  def change
    create_table :profiles_skills, id: false do |t|
      t.belongs_to :profile, index: true
      t.belongs_to :skill, index: true
      t.index ["profile_id", "skill_id"], name: "index_profiles_skills_on_profile_id_skill_id", unique: true, using: :btree
    end
  end
end

Create your models:

Profile.create
Profile.create
Skill.create(name: 'foo')
Skill.create(name: 'bar')
Skill.create(name: 'baz')
Profile.first.skills << Skill.first
Profile.first.skills << Skill.second
Profile.second.skills << Skill.second
Profile.second.skills << Skill.third

And then run your query to return just the first profile:

skills = %w(foo bar).uniq
Profile.includes(:skills).where('skills.name' => skills).group(:id).having("count(skills.id) >= #{skills.size}")
  SQL (0.4ms)  SELECT  DISTINCT "profiles"."id" FROM "profiles" LEFT OUTER JOIN "profiles_skills" ON "profiles_skills"."profile_id" = "profiles"."id" LEFT OUTER JOIN "skills" ON "skills"."id" = "profiles_skills"."skill_id" WHERE "skills"."name" IN ('foo', 'bar') GROUP BY "profiles"."id" HAVING (count(skills.id) = 2) LIMIT ?  [["LIMIT", 11]]
  SQL (0.2ms)  SELECT "profiles"."id" AS t0_r0, "profiles"."created_at" AS t0_r1, "profiles"."updated_at" AS t0_r2, "skills"."id" AS t1_r0, "skills"."name" AS t1_r1, "skills"."profile_id" AS t1_r2, "skills"."created_at" AS t1_r3, "skills"."updated_at" AS t1_r4 FROM "profiles" LEFT OUTER JOIN "profiles_skills" ON "profiles_skills"."profile_id" = "profiles"."id" LEFT OUTER JOIN "skills" ON "skills"."id" = "profiles_skills"."skill_id" WHERE "skills"."name" IN ('foo', 'bar') AND "profiles"."id" = 1 GROUP BY "profiles"."id" HAVING (count(skills.id) = 2)
 => #<ActiveRecord::Relation [#<Profile id: 1, created_at: "2017-07-28 21:52:56", updated_at: "2017-07-28 21:52:56">]>

Confirm with additional testing:

Should return both profiles:

skills = %w(bar).uniq
Profile.includes(:skills).where('skills.name' => skills).group(:id).having("count(skills.id) >= #{skills.size}")
  SQL (0.4ms)  SELECT  DISTINCT "profiles"."id" FROM "profiles" LEFT OUTER JOIN "profiles_skills" ON "profiles_skills"."profile_id" = "profiles"."id" LEFT OUTER JOIN "skills" ON "skills"."id" = "profiles_skills"."skill_id" WHERE "skills"."name" = 'bar' GROUP BY "profiles"."id" HAVING (count(skills.id) >= 1) LIMIT ?  [["LIMIT", 11]]
  SQL (0.3ms)  SELECT "profiles"."id" AS t0_r0, "profiles"."created_at" AS t0_r1, "profiles"."updated_at" AS t0_r2, "skills"."id" AS t1_r0, "skills"."name" AS t1_r1, "skills"."profile_id" AS t1_r2, "skills"."created_at" AS t1_r3, "skills"."updated_at" AS t1_r4 FROM "profiles" LEFT OUTER JOIN "profiles_skills" ON "profiles_skills"."profile_id" = "profiles"."id" LEFT OUTER JOIN "skills" ON "skills"."id" = "profiles_skills"."skill_id" WHERE "skills"."name" = 'bar' AND "profiles"."id" IN (1, 2) GROUP BY "profiles"."id" HAVING (count(skills.id) >= 1)
 => #<ActiveRecord::Relation [#<Profile id: 1, created_at: "2017-07-28 21:52:56", updated_at: "2017-07-28 21:52:56">, #<Profile id: 2, created_at: "2017-07-28 21:53:34", updated_at: "2017-07-28 21:53:34">]>

Should return just the second profile:

skills = %w(bar baz).uniq
  SQL (0.3ms)  SELECT  DISTINCT "profiles"."id" FROM "profiles" LEFT OUTER JOIN "profiles_skills" ON "profiles_skills"."profile_id" = "profiles"."id" LEFT OUTER JOIN "skills" ON "skills"."id" = "profiles_skills"."skill_id" WHERE "skills"."name" IN ('bar', 'baz') GROUP BY "profiles"."id" HAVING (count(skills.id) >= 2) LIMIT ?  [["LIMIT", 11]]
  SQL (0.2ms)  SELECT "profiles"."id" AS t0_r0, "profiles"."created_at" AS t0_r1, "profiles"."updated_at" AS t0_r2, "skills"."id" AS t1_r0, "skills"."name" AS t1_r1, "skills"."profile_id" AS t1_r2, "skills"."created_at" AS t1_r3, "skills"."updated_at" AS t1_r4 FROM "profiles" LEFT OUTER JOIN "profiles_skills" ON "profiles_skills"."profile_id" = "profiles"."id" LEFT OUTER JOIN "skills" ON "skills"."id" = "profiles_skills"."skill_id" WHERE "skills"."name" IN ('bar', 'baz') AND "profiles"."id" = 2 GROUP BY "profiles"."id" HAVING (count(skills.id) >= 2)
 => #<ActiveRecord::Relation [#<Profile id: 2, created_at: "2017-07-28 21:53:34", updated_at: "2017-07-28 21:53:34">]>

Should return no profiles:

skills = %w(foo baz).uniq
Profile.includes(:skills).where('skills.name' => skills).group(:id).having("count(skills.id) >= #{skills.size}")
  SQL (0.3ms)  SELECT  DISTINCT "profiles"."id" FROM "profiles" LEFT OUTER JOIN "profiles_skills" ON "profiles_skills"."profile_id" = "profiles"."id" LEFT OUTER JOIN "skills" ON "skills"."id" = "profiles_skills"."skill_id" WHERE "skills"."name" IN ('foo', 'baz') GROUP BY "profiles"."id" HAVING (count(skills.id) >= 2) LIMIT ?  [["LIMIT", 11]]
 => #<ActiveRecord::Relation []>
like image 4
anothermh Avatar answered Nov 07 '22 09:11

anothermh