Unless I am mistaken: joins
has better performance than includes
because at the database level:
joins
causes an inner join
includes
causes a subquery
And in general, an inner join
is faster than a subquery
.
Example:
#app/models/owner.rb
class Owner < ActiveRecord::Base
has_many :pets
end
#app/models/pet.rb
class Pet < ActiveRecord::Base
belongs_to :owner
end
Using rails console
:
# showing how 'includes' in rails causes an IN statement which is a subquery
irb(main):001:0> @owners = Owner.all.includes(:pets)
Owner Load (2.7ms) SELECT "owners".* FROM "owners"
Pet Load (0.4ms) SELECT "pets".* FROM "pets" WHERE "pets"."owner_id" IN (1, 2, 3)
And now using joins
which causes an inner join
:
irb(main):001:0> @owners = Owner.all.joins(:pets)
Owner Load (0.3ms) SELECT "owners".* FROM "owners" INNER JOIN "pets" ON "pets"."owner_id" = "owners"."id"
So it would seem like it would almost always be better the use joins
over includes
because:
includes
causes a subquery
(the IN
statement)joins
causes an inner join
which is usually faster than a subqueryHowever, there is one gotcha with using joins
. This article does a great job describing it. Basically, includes
loads all the associated objects into memory, so that if you query for any of the attributes for those associated objects, it doesn't hit the database. Meanwhile, joins
DOES NOT load into memory the associated objects' attributes, so if you query for any of the attributes, it makes additional hits on the database.
So here is my question: Is it possible to do inner joins like with joins
for performance but at the same time load all the associated objects into memory like includes
does?
Put another way: is it possible to load all the associated objects into memory like includes
does, but causes an inner join as opposed to a subquery?
Active Record facilitates the creation and use of business objects whose data requires persistent storage to a database. It is an implementation of the Active Record pattern which itself is a description of an Object Relational Mapping system.
:joins uses inner join, :includes uses outer join. the main reason of :includes is eager loading, to avoid the N+1 problem of loading in attributes of each object using a separate query.
The active record pattern is an approach to accessing data in a database. A database table or view is wrapped into a class. Thus, an object instance is tied to a single row in the table. After creation of an object, a new row is added to the table upon save.
What is ActiveRecord? ActiveRecord is an ORM. It's a layer of Ruby code that runs between your database and your logic code. When you need to make changes to the database, you'll write Ruby code, and then run "migrations" which makes the actual changes to the database.
I think your assumption that a JOIN
is always faster than two queries is not correct. It highly depends on the size of your database tables.
Imagine you have thousands of owners and pets in your database. Then your database had to join all together first, even if you just want to load 10 records. On the other hand one query loading 10 owners and one query to load all pets for that 10 owners would be faster than that JOIN
.
I would argue that both methods exist to solve different problems:
joins
is used when you need to combine two tables to run a query on the data of both tables.includes
is used to avoid N+1 queries.Btw: The Rails documentation has a note that includes
has performance benefits over joins
:
This will often result in a performance improvement over a simple join.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With