Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to use ANY instead of IN in a WHERE clause with Rails?

I used to have a query like:

MyModel.where(id: ids)

Which generates sql query like:

SELECT "my_models".* FROM "my_models"
WHERE  "my_models"."id" IN (1, 28, 7, 8, 12)

Now I want to change this to use ANY instead of IN. I created this:

MyModel.where("id = ANY(VALUES(#{ids.join '),('}))"

Now when I use empty array ids = [] I get the folowing error:

MyModel Load (53.0ms)  SELECT "my_models".* FROM "my_models"  WHERE (id = ANY(VALUES()))
ActiveRecord::JDBCError: org.postgresql.util.PSQLException: ERROR: syntax error at or near ")"
ActiveRecord::StatementInvalid: ActiveRecord::JDBCError: org.postgresql.util.PSQLException: ERROR: syntax error at or near ")"
Position: 75: SELECT "social_messages".* FROM "social_messages"  WHERE (id = ANY(VALUES()))
    from arjdbc/jdbc/RubyJdbcConnection.java:838:in `execute_query'
like image 358
Eki Eqbal Avatar asked Jul 02 '15 17:07

Eki Eqbal


1 Answers

There are two variants of IN expressions:

  • expression IN (subquery)
  • expression IN (value [, ...])

Similarly, two variants with the ANY construct:

  • expression operator ANY (subquery)
  • expression operator ANY (array expression)

A subquery works for either technique, but for the second form of each, IN expects a list of values (as defined in standard SQL) while = ANY expects an array.

Which to use?

ANY is a later, more versatile addition, it can be combined with any binary operator returning a boolean value. IN burns down to a special case of ANY. In fact, its second form is rewritten internally:

IN is rewritten with = ANY
NOT IN is rewritten with <> ALL

Check the EXPLAIN output for any query to see for yourself. This proves two things:

  • IN can never be faster than = ANY.
  • = ANY is not going to be substantially faster.

The choice should be decided by what's easier to provide: a list of values or an array (possibly as array literal - a single value).

If the IDs you are going to pass come from within the DB anyway, it is much more efficient to select them directly (subquery) or integrate the source table into the query with a JOIN (like @mu commented).

To pass a long list of values from your client and get the best performance, use an array, unnest() and join, or provide it as table expression using VALUES (like @PinnyM commented). But note that a JOIN preserves possible duplicates in the provided array / set while IN or = ANY do not. More:

  • Optimizing a Postgres query with a large IN

In the presence of NULL values, NOT IN is often the wrong choice and NOT EXISTS would be right (and faster, too):

  • Select rows which are not present in other table

Syntax for = ANY

For the array expression Postgres accepts:

  • an array constructor (array is constructed from a list of values on the Postgres side) of the form: ARRAY[1,2,3]
  • or an array literal of the form '{1,2,3}'.

To avoid invalid type casts, you can cast explicitly:

ARRAY[1,2,3]::numeric[]
'{1,2,3}'::bigint[]

Related:

  • PostgreSQL: Issue with passing array to procedure
  • How to pass custom type array to Postgres function

Or you could create a Postgres function taking a VARIADIC parameter, which takes individual arguments and forms an array from them:

  • Passing multiple values in single parameter

How to pass the array from Ruby?

Assuming id to be integer:

MyModel.where('id = ANY(ARRAY[?]::int[])', ids.map { |i| i})

But I am just dabbling in Ruby. @mu provides detailed instructions in this related answer:

  • Sending array of values to a sql query in ruby?
like image 112
Erwin Brandstetter Avatar answered Sep 22 '22 06:09

Erwin Brandstetter