Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

For new ActiveRecord models, why do some has_many :through associations add a (1=0) predicate and distinct clause to the sql query?

Whenever I instantiate a new ActiveRecord model (one that has not been persisted to the database) and attempt to access some various associations on the built model, the Rails query builder will sometimes:

  1. Add a (1=0) predicate to the where clause of the query.
  2. Add a 'distinct` clause to the select statement.

I think this only occurs when the has_many :through association is joining two or more tables.

I want to know why it adds the (1=0) predicate as well as the distinct clause. For the (1=0) predicate, it shouldn't matter if the new model has been saved to the database or not (right?). I have no idea why the distinct clause is being added.

I have a simple example below.

class Assignment < ActiveRecord::Base
  has_many :assignment_attachments
  has_many :attachments, through: :assignment_attachments
end

class AssignmentAttachment < ActiveRecord::Base
  belongs_to :assignment
  belongs_to :attachment
end

class Attachment < ActiveRecord::Base
  has_many :assignment_attachments
  has_many :assignments, through: :assignment_attachments
end

class Submission < ActiveRecord::Base
  belongs_to :assignment
  has_many :assignment_attachments, through: :assignment
  has_many :attachments, through: :assignment
end

s = Submission.new(assignment: Assignment.first)
s.assignment #=> #<Assignment ...>
s.assignment_attachments #=> [#AssignmentAttachment id: '1'>, #AssignmentAttachment assignment_id: '1', attachment_id: '1' ...>]
s.attachments #=> []

Here's the sql query for s.attachments:

SELECT DISTINCT attachments.*
FROM attachments
INNER JOIN assignment_attachments ON attachments.id = assignment_attachments.attachment_id
INNER JOIN assignments ON assignment_attachments.assignment_id = assignments.id
WHERE assignments.id = 'a0dbfdc7-0d67-4aad-ad06-6a7a5a91d2d0' AND (1=0)

Located somewhere deep in one of the subtrees of the abstract syntax tree that arel builds:

# the 'distinct' select clause
#<Arel::Nodes::SelectCore:0x007ffe43d45be0
  @groups=[],
  @having=nil,
  @projections=
   [#<struct Arel::Attributes::Attribute
     relation=
      #<Arel::Table:0x007ffe45a7be58
       @aliases=[],
       @columns=nil,
       @engine=
        Attachment(name: string, description: text, created_at: datetime, updated_at: datetime, required: boolean, id: uuid, slug: string, file_types: string),
       @name="attachments",
       @primary_key=nil,
       @table_alias=nil>,
     name="*">],
  @set_quantifier=#<Arel::Nodes::Distinct:0x007ffe43d44dd0>,
  ...

# the (1=0) predicate
@wheres=
 [#<Arel::Nodes::And:0x007ffe43d45028
   @children=
    [#<Arel::Nodes::Equality:0x007ffe45958788
      @left=
       #<struct Arel::Attributes::Attribute
        relation=
         #<Arel::Table:0x007ffe45958e68
          @aliases=[],
          @columns=nil,
          @engine=ActiveRecord::Base,
          @name="assignments",
          @primary_key=nil,
          @table_alias=nil>,
        name="id">,
      @right=#<Arel::Nodes::BindParam:0x007ffe45958878>>,
     #<Arel::Nodes::Grouping:0x007ffe43d45050 @expr="1=0">]>]

Do you know why arel is building the distinct clause and the (1=0) predicate? I can use some workarounds to get what I want - however, I would love to be able to investigate and find out why and how this tree is built.

Thanks for any/all advice.

like image 331
Kurt Mueller Avatar asked Oct 13 '15 22:10

Kurt Mueller


1 Answers

Note this is an edit to reflect new info from OP. With the new info I have recreated the scenario in a fresh project by copy and pasting OP's code plus the following migration.

Migration:

class CreateAttachments < ActiveRecord::Migration
  def change
    create_table :attachments do |t|
      t.integer :assignment_attachment_id
      t.timestamps null: false
    end

    create_table :assignments do |t|
      t.integer :assignment_attachment_id
      t.timestamps null: false
    end

    create_table :assignment_attachments do |t|
      t.integer :assignment_id
      t.integer :attachment_id
      t.timestamps null: false
    end

    create_table :submissions do |t|
      t.integer :assignment_id
      t.timestamps null: false
    end
  end
end

Works fine when associations exist in DB

When I save the Attachment and Assignment models to which I will associate my Submission model s, even though never save s, it works fine.

irb(main):001:0> attachment = Attachment.new
=> #<Attachment id: nil, assignment_attachment_id: nil, created_at: nil, updated_at: nil>
irb(main):003:0> assignment = Assignment.new
=> #<Assignment id: nil, assignment_attachment_id: nil, created_at: nil, updated_at: nil>
irb(main):005:0> assignment.attachments << attachment
=> #<ActiveRecord::Associations::CollectionProxy [#<Attachment id: nil, assignment_attachment_id: nil, created_at: nil, updated_at: nil>]>
irb(main):006:0> assignment.save
   (0.2ms)  begin transaction
  SQL (1.5ms)  INSERT INTO "assignments" ("created_at", "updated_at") VALUES (?, ?)  [["created_at", "2015-10-14 17:10:34.936929"], ["updated_at", "2015-10-14 17:10:34.936929"]]
  SQL (0.6ms)  INSERT INTO "attachments" ("created_at", "updated_at") VALUES (?, ?)  [["created_at", "2015-10-14 17:10:34.944453"], ["updated_at", "2015-10-14 17:10:34.944453"]]
  SQL (0.3ms)  INSERT INTO "assignment_attachments" ("assignment_id", "attachment_id", "created_at", "updated_at") VALUES (?, ?, ?, ?)  [["assignment_id", 1], ["attachment_id", 1], ["created_at", "2015-10-14 17:10:34.947481"], ["updated_at", "2015-10-14 17:10:34.947481"]]
   (0.8ms)  commit transaction
=> true
irb(main):007:0> s = Submission.new(assignment: Assignment.first)
  Assignment Load (0.2ms)  SELECT  "assignments".* FROM "assignments"  ORDER BY "assignments"."id" ASC LIMIT 1
=> #<Submission id: nil, assignment_id: 1, created_at: nil, updated_at: nil>
irb(main):008:0> s.assignment
=> #<Assignment id: 1, assignment_attachment_id: nil, created_at: "2015-10-14 17:10:34", updated_at: "2015-10-14 17:10:34">
irb(main):009:0> s.assignment_attachments
  AssignmentAttachment Load (0.2ms)  SELECT "assignment_attachments".* FROM "assignment_attachments" INNER JOIN "assignments" ON "assignment_attachments"."assignment_id" = "assignments"."id" WHERE "assignments"."id" = ?  [["id", 1]]
=> #<ActiveRecord::Associations::CollectionProxy [#<AssignmentAttachment id: 1, assignment_id: 1, attachment_id: 1, created_at: "2015-10-14 17:10:34", updated_at: "2015-10-14 17:10:34">]>
irb(main):010:0> s.attachments
  Attachment Load (0.2ms)  SELECT "attachments".* FROM "attachments" INNER JOIN "assignment_attachments" ON "attachments"."id" = "assignment_attachments"."attachment_id" INNER JOIN "assignments" ON "assignment_attachments"."assignment_id" = "assignments"."id" WHERE "assignments"."id" = ?  [["id", 1]]
=> #<ActiveRecord::Associations::CollectionProxy [#<Attachment id: 1, assignment_attachment_id: nil, created_at: "2015-10-14 17:10:34", updated_at: "2015-10-14 17:10:34">]>
irb(main):011:0> s.attachments.to_sql
=> "SELECT \"attachments\".* FROM \"attachments\" INNER JOIN \"assignment_attachments\" ON \"attachments\".\"id\" = \"assignment_attachments\".\"attachment_id\" INNER JOIN \"assignments\" ON \"assignment_attachments\".\"assignment_id\" = \"assignments\".\"id\" WHERE \"assignments\".\"id\" = 1"

Generates "1=0" predicate when there are no associations

If I do not specify any associations to the Submission instance, however, when I ask for its attachments, the query does get the 1=0 predicate. As this post describes, the 1=0 predicate is added when you are trying to retrieve records that join on an array of ids that is actually empty. This is true here, since we made sure that there are no assignment ids we can use for joining to attachment.

irb(main):007:0> Submission.new.attachments.to_sql
=> "SELECT \"attachments\".* FROM \"attachments\" INNER JOIN \"assignment_attachments\" ON \"attachments\".\"id\" = \"assignment_attachments\".\"attachment_id\" INNER JOIN \"assignments\" ON \"assignment_attachments\".\"assignment_id\" = \"assignments\".\"id\" WHERE \"assignments\".\"id\" = NULL AND (1=0)"
irb(main):008:0>

Notice how it says WHERE \"assignments\".\"id\" = NULL. It can't leave it at that because it doesn't want to assume that there aren't null ids in the assignments table that would cause it to return false positives. With the additional 1=0 predicate, you are ensured to get the correct answer: an empty result.

DISTINCT

I am unable to reproduce a scenario where DISTINCT appears.

like image 175
Rob Wise Avatar answered Nov 16 '22 14:11

Rob Wise