
How do I avoid multiple queries with :include in Rails?

If I do this

post = Post.find_by_id(post_id, :include => :comments)

two queries are performed (one for the post data and another for the post's comments). Then when I call post.comments, no further query is performed because the data is already cached.

Is there a way to do just one query and still access the comments via post.comments?

Chad Johnson asked Jun 06 '11 00:06



1 Answer

No, there is not. This is the intended behavior of :include, since the single-JOIN approach usually turns out to be less efficient for prefetching.

For example, consider the following scenario: the Post model has 3 fields that you need to select, 2 fields for Comment, and this particular post has 100 comments. Rails could run a single JOIN query along the lines of:

SELECT posts.id, posts.title, posts.author_id, comments.id, comments.body
FROM posts
INNER JOIN comments ON comments.post_id = posts.id
WHERE posts.id = 1

This would return the following table of results:

 posts.id | posts.title | posts.author_id | comments.id | comments.body
----------+-------------+-----------------+-------------+---------------
        1 | Hello!      |               1 |           1 | First!
        1 | Hello!      |               1 |           2 | Second!
        1 | Hello!      |               1 |           3 | Third!
        1 | Hello!      |               1 |           4 | Fourth!
...96 more...

You can see the problem already. The single-query JOIN approach, though it returns the data you need, returns it redundantly. When the database server sends the result set to Rails, it will send the post's ID, title, and author ID 100 times each. Now, suppose that the Post had 10 fields you were interested in, 8 of which were text blocks. Eww. That's a lot of data. Transferring data from the database to Rails does take work on both sides, both in CPU cycles and RAM, so minimizing that data transfer is important for making the app run faster and leaner.
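To put rough numbers on that redundancy, here is a minimal plain-Ruby sketch (no Rails or database involved) that counts how many column values each strategy would send for the scenario above. The method names are made up for illustration, and the count deliberately ignores the post_id foreign key column that a real second query would also carry.

```ruby
# Figures taken from the example: 3 post fields, 2 comment fields, 100 comments.
POST_FIELDS = 3
COMMENT_FIELDS = 2
COMMENT_COUNT = 100

# Single JOIN: every result row repeats the post columns next to one comment.
# (Hypothetical helper, not a Rails API.)
def join_values(post_fields, comment_fields, comment_count)
  comment_count * (post_fields + comment_fields)
end

# Two queries: the post columns arrive once, then one row per comment.
# (Hypothetical helper; ignores the extra post_id column for simplicity.)
def two_query_values(post_fields, comment_fields, comment_count)
  post_fields + comment_count * comment_fields
end

join_total = join_values(POST_FIELDS, COMMENT_FIELDS, COMMENT_COUNT)
split_total = two_query_values(POST_FIELDS, COMMENT_FIELDS, COMMENT_COUNT)

puts "single JOIN: #{join_total} values"   # 100 * (3 + 2) = 500
puts "two queries: #{split_total} values"  # 3 + 100 * 2 = 203
```

The gap widens as the post gets wider: with 10 post fields the JOIN figure becomes 1200 values against 210 for the two-query approach, and that is before accounting for the size of the text blocks themselves.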

The Rails devs crunched the numbers, and most applications run better when using multiple queries that only fetch each bit of data once rather than one query that has the potential to get hugely redundant.
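Conceptually, the multiple-query strategy boils down to fetching each table once and matching the rows up in memory. The following is only a plain-Ruby simulation of that idea, not the actual Rails internals: the Struct definitions, the hard-coded rows, and the attach_comments helper are all assumptions made for the sake of the example.

```ruby
# Hypothetical stand-ins for model objects; real Rails models are not Structs.
Post = Struct.new(:id, :title, :comments)
Comment = Struct.new(:id, :post_id, :body)

# Pretend result of query 1: the post row, fetched once.
post_rows = [Post.new(1, "Hello!", [])]

# Pretend result of query 2: the comments for that post, each fetched once.
comment_rows = [
  Comment.new(1, 1, "First!"),
  Comment.new(2, 1, "Second!"),
  Comment.new(3, 1, "Third!")
]

# Attach each comment to its parent by foreign key, with no further queries.
# Posts without comments get an empty array.
def attach_comments(posts, comments)
  by_post = comments.group_by(&:post_id)
  posts.each { |post| post.comments = by_post.fetch(post.id, []) }
end

post = attach_comments(post_rows, comment_rows).first
puts post.comments.map(&:body).inspect
```

Once the rows are grouped like this, reading post.comments is just a lookup on data already in memory, which is why no third query is needed.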

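Of course, there comes a time in every developer's life when a join is necessary in order to run complex conditions, and that can be achieved by replacing :include with :joins. Keep in mind that :joins only adds the JOIN to the query; it does not load the comments, so calling post.comments afterwards would still run its own query. For prefetching relationships, the approach Rails takes in :include is much better for performance.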

Matchu answered Sep 23 '22 21:09