Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Rails: Using .references on joins even though it is not required

I know that when you utilize includes and you specify a where clause on the joined table, you should use .references

example:

# will error out or throw deprecation warning in logs
customers = Customer.includes(:orders).where("Orders.cost < ?", 100)

Otherwise, in rails 4 or later, you will get an error like the following:

Mysql2::Error: Unknown column 'Orders.cost' in 'where clause': SELECT customers.* FROM customers WHERE (Orders.cost < 100)

Or you will get a deprecation warning:

DEPRECATION WARNING: It looks like you are eager loading table(s) (one of: users, addresses) that are referenced in a string SQL snippet. For example:

Post.includes(:comments).where("comments.title = 'foo'")

Currently, Active Record recognizes the table in the string, and knows to JOIN the comments table to the query, rather than loading comments in a separate query. However, doing this without writing a full-blown SQL parser is inherently flawed. Since we don't want to write an SQL parser, we are removing this functionality. From now on, you must explicitly tell Active Record when you are referencing a table from a string:

Post.includes(:comments).where("comments.title = 'foo'").references(:comments)

If you don't rely on implicit join references you can disable the feature entirely by setting config.active_record.disable_implicit_join_references = true. (

SELECT "users"."id" AS t0_r0, "users"."name" AS t0_r1, "users"."email" AS t0_r2, "users"."created_at" AS t0_r3, "users"."updated_at" AS t0_r4, "addresses"."id" AS t1_r0, "addresses"."user_id" AS t1_r1, "addresses"."country" AS t1_r2, "addresses"."street" AS t1_r3, "addresses"."postal_code" AS t1_r4, "addresses"."city" AS t1_r5, "addresses"."created_at" AS t1_r6, "addresses"."updated_at" AS t1_r7 FROM "users" LEFT OUTER JOIN "addresses" ON "addresses"."user_id" = "users"."id" WHERE (addresses.country = 'Poland')

Makes sense. Rails does not want to be an SQL parser when you utilize includes. So to make rails happy we do this:

# No error and no deprecation warning because we are explicitly specifying what table is referenced within that where clause
customers = Customer.includes(:orders).where("Orders.cost < ?", 100).references(:orders)

However, when you use joins and a where clause on the joined table: Rails does not error out, and rails also does not throw a deprecation warning as it does with includes:

# No error and no deprecation warning.  What????
customers = Customer.joins(:orders).where("Orders.cost < ?", 100)

This does not make sense to me. I would think that if rails does not want to be an SQL parser for includes, then it also would not want to be an sql parser for joins. I would think that rails would prefer me to use references like so:

customers = Customer.joins(:orders).where("Orders.cost < ?", 100).references(:orders)

So my question(s):

  1. Am I missing something? Is it absolutely fine to not specify references for joins with where clauses on joined tables, even though it is pretty much required for includes? If so: why this difference between includes and joins?
  2. Should I specify references on joins moving forward? Perhaps once rails' sql parser goes away all those joins with where clauses on joined tables will no longer work anymore?
like image 553
Neil Avatar asked Feb 08 '23 03:02

Neil


2 Answers

The fundamental difference in between joins and includes (which is a wrapper for 2 methods : eager_load and preload) is that :

  • joinsbluntly executes SQL and returns all records indiscriminately
  • includeswill determine which records belongs to which model and build a tree of models : author1 -> [post, post, post], author2 -> [post, post]

For example if you join Order.joins(:product_items) you'll get :

  • all ProductItem records intertwined with Orders.
  • Whereas with Order.includes(:product_items) you'll only get Orders with product_items nested in each corresponding order.

So reference here is only useful to Rails when you do includes (or eager_load or preload) to help it differentiate the data coming out of a SQL join and assign it to the proper Models.

With joins you're on your own in SQL world. Most of the time I'll use it with group("orders.id") and an aggregate function :

select(orders.*, count(product_items.id) as cart_size)...

like image 145
charlysisto Avatar answered Apr 20 '23 00:04

charlysisto


The key difference here is that includes has 2 modes of operation: eager_load (which uses a join for each association) and preload (which runs a separate query for each association and thus doesn't support conditions that reference the joined tables)

Before rails 2 (or thereabouts) the only option was the join based one. When the new strategy was introduced it was made the default, with a fallback to the old strategy if it looked like it was needed in order to reduce regressions in people's apps.

That detection was always a bit messy and there was a never ending long tail of cases that weren't detected, hence the eventual decision to remove the fallback and require the explicit use of references to indicate which included associations were used in any clauses of the query. You can see this in the eager_loading? method that is used to decide which strategy to use.

You can also use preload or eager_load instead of includes to directly indicate which strategy you want (or to use different strategies for different associations).

On the other hand joins always does a join - it has never needed to parse your query because it simply doesn't need to - it will generate a join clause no matter what.

like image 30
Frederick Cheung Avatar answered Apr 20 '23 00:04

Frederick Cheung