Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

(Rails) When to use ActiveRecord's .where and .select

I have a Model in my Rails app named User, which has an attribute name

Let's say I want to find all Users whose name matches a specific string.

In a Rails Console, I could do:

OPTION 1

User.where("name LIKE 'string'")

This is very fast.

OPTION 2

User.select{|u| u.name == "string"}

This is extremely slow. It works for a small database, but if you have hundreds of thousands of users, this appears to try to load all Users into memory first, and then iterate them in the block.

Does this mean that option 2 is always wrong ? What's the correct use case for .select then, when is it preferable over .where ?

I wrote a bunch of code in my app that is working fine using .select, but now that I tried using over a very large table I saw that I might be doing something wrong.

like image 781
sandre89 Avatar asked Dec 02 '15 03:12

sandre89


People also ask

What is the difference between find and Find_by in rails?

The additional difference between find() and find_by() is that find could only be used to search by primary key (usually the 'id') while the find_by() requires and searches by attribute (either passed as hash like Employee. find_by(name: 'Mike') or using the Employee.

Is ActiveRecord an ORM?

ActiveRecord is an ORM. It's a layer of Ruby code that runs between your database and your logic code.

What is ActiveRecord in Ruby on Rails?

Active Record is the M in MVC - the model - which is the layer of the system responsible for representing business data and logic. Active Record facilitates the creation and use of business objects whose data requires persistent storage to a database.


1 Answers

Select: http://apidock.com/rails/ActiveRecord/QueryMethods/select

"This will build an array of objects from the database for the scope, converting them into an array and iterating through them using Array#select."

So, it's so slow because it fetches EVERYTHING from the database, converts them to Ruby objects one by one, then runs the block you give on them. If you've got a database table with 100,000 rows, that means instantiating 100,000 ruby objects and running your block on them. Ruby is slow, so this is slow.

Using where will basically just put together an SQL statement, and let the database handle it. Databases are FAST - if speed is a priority, then whenever you can defer to the database, you should do so. If you have a database table with 100,000 rows, but only 10 match your where query, then the database will handle the filtering, and Ruby will only receive 10 rows back and will therefore only have to instantiate 10 Ruby objects, not 100,000.

Almost always, you'll want to use where rather than select. So, when might you use select? Well you'd use it (often in conjunction with where) when you want to filter a relatively small number of records based on some ruby conditions that would be reasonably hard to translate to SQL.

Example:

User.where(account_cancelled: true).select do |u|
  # perform complex ruby logic on user 
  # that would be hard to do with plain SQL
  # (note, this will only act on the subset of 
  # users returned from the where clause)
end

So, whenever you reasonably can use where rather than select, you should do so.

(there's also a second use of select mentioned in the documentation: "Modifies the SELECT statement for the query so that only certain fields are retrieve" - but that's not the use you're dealing with in your question)

like image 148
joshua.paling Avatar answered Oct 13 '22 00:10

joshua.paling