Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ActiveRecord query changing when a dot/period is in condition value

See the updates at the bottom. I've narrowed this down significantly.

I've also created a barebones app demonstrating this bug: https://github.com/coreyward/bug-demo

And I've also created a bug ticket in the official tracker: https://rails.lighthouseapp.com/projects/8994/tickets/6611-activerecord-query-changing-when-a-dotperiod-is-in-condition-value

If someone can either tell me how to monkey-patch this or explain where this is happening in Rails I'd be very grateful.


I'm getting some bizarre/unexpected behavior. That'd lead me to believe either there is a bug (confirmation that this is a bug would be a perfect answer), or I am missing something that is right under my nose (or that I don't understand).

The Code

class Gallery < ActiveRecord::Base
  belongs_to :portfolio
  default_scope order(:ordinal)
end

class Portfolio < ActiveRecord::Base
  has_many :galleries
end

# later, in a controller action
scope = Portfolio.includes(:galleries) # eager load galleries
if some_condition
  @portfolio = scope.find_by_domain('domain.com')
else
  @portfolio = scope.find_by_vanity_url('vanity_url')
end
  • I have Portfolios which can have multiple Galleries each.
  • The galleries have ordinal, vanity_url, and domain attributes.
  • The gallery ordinals are set as integers from zero on up. I've confirmed that this works as expected by checking Gallery.where(:portfolio_id => 1).map &:ordinal, which returns [0,1,2,3,4,5,6] as expected.
  • Both vanity_url and domain are t.string, :null => false columns with unique indexes.

The Problem

If some_condition is true and find_by_domain is run, the galleries returned do not respect the default scope. If find_by_vanity_url is run, the galleries are ordered according to the default scope. I looked at the queries being generated, and they are very different.

The Queries

# find_by_domain SQL: (edited out additional selected columns for brevity)

Portfolio Load (2.5ms)  SELECT DISTINCT `portfolios`.id FROM `portfolios` LEFT OUTER JOIN `galleries` ON `galleries`.`portfolio_id` = `portfolios`.`id` WHERE `portfolios`.`domain` = 'lvh.me' LIMIT 1
Portfolio Load (0.4ms)  SELECT `portfolios`.`id` AS t0_r0, `portfolios`.`vanity_url` AS t0_r2, `portfolios`.`domain` AS t0_r11, `galleries`.`id` AS t1_r0, `galleries`.`portfolio_id` AS t1_r1, `galleries`.`ordinal` AS t1_r6 FROM `portfolios` LEFT OUTER JOIN `galleries` ON `galleries`.`portfolio_id` = `portfolios`.`id` WHERE `portfolios`.`domain` = 'lvh.me' AND `portfolios`.`id` IN (1)

# find_by_vanity_url SQL:

Portfolio Load (0.4ms)  SELECT `portfolios`.* FROM `portfolios` WHERE `portfolios`.`vanity_url` = 'cw' LIMIT 1
Gallery Load (0.3ms)  SELECT `galleries`.* FROM `galleries` WHERE (`galleries`.portfolio_id = 1) ORDER BY ordinal

So the query generated by find_by_domain doesn't have an ORDER statement, hence things aren't being ordered as desired. My question is...

Why is this happening? What is prompting Rails 3 to generate different queries to these two columns?


Update

This is really strange. I've considered and ruled out all of the following:

  • Indexes on the columns
  • Reserved/special words in Rails
  • A column name collision between the tables (ie. domain being on both tables)
  • The field type, both in the DB and Schema
  • The "allow null" setting
  • The separate scope

I get the same behavior as find_by_vanity_url with location, phone, and title; I get the same behavior as find_by_domain with email.


Another Update

I've narrowed it down to when the parameter has a period (.) in the name:

find_by_something('localhost') # works fine
find_by_something('name_routed_to_127_0_0_1') # works fine
find_by_something('my_computer.local') # fails
find_by_something('lvh.me') #fails

I'm not familiar enough with the internals to say where the query formed might change based on the value of a WHERE condition.

like image 603
coreyward Avatar asked Mar 04 '11 21:03

coreyward


1 Answers

The difference between the two strategies for eager loading are discussed in the comments here

https://github.com/rails/rails/blob/3-0-stable/activerecord/lib/active_record/association_preload.rb

From the documentation:

# The second strategy is to use multiple database queries, one for each
# level of association. Since Rails 2.1, this is the default strategy. In
# situations where a table join is necessary (e.g. when the +:conditions+
# option references an association's column), it will fallback to the table
# join strategy.

I believe that the dot in "foo.bar" is causing active record to think that you are putting a condition on a table that is outside of the originating model which prompts the second strategy discussed in the documentation.

The two separate queries runs one with the Person model and the second with the Item model.

 Person.includes(:items).where(:name => 'fubar')

Person Load (0.2ms)  SELECT "people".* FROM "people" WHERE "people"."name" = 'fubar'
Item Load (0.4ms)  SELECT "items".* FROM "items" WHERE ("items".person_id = 1) ORDER BY items.ordinal

Because you run the second query against the Item model, it inherits the default scope where you specified order(:ordinal).

The second query, which it attempts eager loading with the full runs off the person model and will not use the default scope of the association.

 Person.includes(:items).where(:name => 'foo.bar')

Person Load (0.4ms)  SELECT "people"."id" AS t0_r0, "people"."name" AS t0_r1, 
"people"."created_at" AS t0_r2, "people"."updated_at" AS t0_r3, "items"."id" AS t1_r0, 
"items"."person_id" AS t1_r1, "items"."name" AS t1_r2, "items"."ordinal" AS t1_r3, 
"items"."created_at" AS t1_r4, "items"."updated_at" AS t1_r5 FROM "people" LEFT OUTER JOIN 
"items" ON "items"."person_id" = "people"."id" WHERE "people"."name" = 'foo.bar'

It is a little buggy to think that, but I can see how it would be with the several different ways you can present a list of options, the way to be sure that you catch all of them would be to scan the completed "WHERE" conditions for a dot and use the second strategy, and they leave it that way because both strategies are functional. I would actually go as far as saying that the aberrant behavior is in the first query, not the second. If you would like the ordering to persist for this query, I recommend one of the following:

1) If you want the association to have an order by when it is called, then you can specify that with the association. Oddly enough, this is in the documentation, but I could not get it to work.

Source: http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html#method-i-has_many

class Person < ActiveRecord::Base
  has_many :items, :order => 'items.ordinal'
end

2) Another method would be to just add the order statement to the query in question.

Person.includes(:items).where(:name => 'foo.bar').order('items.ordinal')

3) Along the same lines would be setting up a named scope

class Person < ActiveRecord::Base
  has_many :items
  named_scope :with_items, includes(:items).order('items.ordinal')
end

And to call that:

Person.with_items.where(:name => 'foo.bar')
like image 89
Geoff Lanotte Avatar answered Sep 28 '22 11:09

Geoff Lanotte