Trying to do distinct on a mode with rails.
2.1.1 :450 > u.profiles.select("profiles.*").distinct
Profile Load (0.9ms) SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id" INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id" WHERE "integrations"."user_id" = $1 [["user_id", 2]]
PG::UndefinedFunction: ERROR: could not identify an equality operator for type json
LINE 1: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integ...
^
: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id" INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id" WHERE "integrations"."user_id" = $1
ActiveRecord::StatementInvalid: PG::UndefinedFunction: ERROR: could not identify an equality operator for type json
LINE 1: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integ...
^
: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id" INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id" WHERE "integrations"."user_id" = $1
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/rack-mini-profiler-0.9.1/lib/patches/sql_patches.rb:109:in `prepare'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/rack-mini-profiler-0.9.1/lib/patches/sql_patches.rb:109:in `prepare'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:834:in `prepare_statement'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:795:in `exec_cache'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql/database_statements.rb:139:in `block in exec_query'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract_adapter.rb:442:in `block in log'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activesupport-4.0.4/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract_adapter.rb:437:in `log'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql/database_statements.rb:137:in `exec_query'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:908:in `select'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract/database_statements.rb:32:in `select_all'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract/query_cache.rb:63:in `select_all'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/querying.rb:36:in `find_by_sql'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:585:in `exec_queries'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/association_relation.rb:15:in `exec_queries'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:471:in `load'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:220:in `to_a'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:573:in `inspect'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands/console.rb:90:in `start'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands/console.rb:9:in `start'
from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands.rb:62:in `<top (required)>'
from bin/rails:4:in `require'
from bin/rails:4:in `<main>'2.1.1 :451 >
Getting an error PG::UndefinedFunction: ERROR: could not identify an equality operator for type json
Converting to Hstore is not an option for me in this case. Any work arounds?
Removing duplicate rows from a query result set in PostgreSQL can be done using the SELECT statement with the DISTINCT clause. It keeps one row for each group of duplicates. The DISTINCT clause can be used for a single column or for a list of columns.
Adding the DISTINCT keyword to a SELECT query causes it to return only unique values for the specified column list so that duplicate rows are removed from the result set. Since DISTINCT operates on all of the fields in SELECT's column list, it can't be applied to an individual field that are part of a larger group.
The DISTINCT clause is used in the SELECT statement to remove duplicate rows from a result set. The DISTINCT clause keeps one row for each group of duplicates. The DISTINCT clause can be applied to one or more columns in the select list of the SELECT statement.
PostgreSQL also provides on an expression as DISTINCT ON that is used with the SELECT statement to remove duplicates from a query set result just like the DISTINCT clause.In addition to that it also keeps the “first row” of each row of duplicates in the query set result.
The reason behind this, is that in PostgreSQL (up to 9.3) there is no equality operator defined for json
(i.e. val1::json = val2::json
will always throw this exception) -- in 9.4 there will be one for the jsonb
type.
One workaround is, you can cast your json
field to text
. But that won't cover all json equalitions. f.ex. {"a":1,"b":2}
should be equal to {"b":2,"a":1}
, but won't be equal if casted to text
.
Another workaround is (if you have a primary key for that table -- which should be) you can use the DISTINCT ON (<expressions>)
form:
u.profiles.select("DISTINCT ON (profiles.id) profiles.*")
Note: One known caveat for DISTINCT ON
:
The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.
Sorry I'm late on this answer, but it might help others.
As I understand your query, you're only getting possible duplicates on profiles
because of the many-to-many join to integrations
(which you're using to determine which profiles
to access).
Because of that, you can use a new GROUP BY
feature as of 9.1:
When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
So in your case, you could get Ruby to create the query (sorry, I don't know the Ruby syntax you're using)...
SELECT profiles.*
FROM "profiles"
INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id"
INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id"
WHERE "integrations"."user_id" = $1
GROUP BY "profiles"."id"
I only removed the DISTINCT
from your SELECT
clause and added the GROUP BY
.
By referring ONLY to the id
in the GROUP BY
, you take advantage of that new feature because all the remaining profiles
columns are "functionally dependent" on that id primary key.
Somehow, wonderfully that avoids the need for Postgres to do equality checks on the dependent columns (ie your json
column in this case).
The DISTINCT ON
solution is also great, and clearly sufficient in your case, but you can't use aggregate functions like array_agg
with it. You CAN with this GROUP BY
approach. Happy days! :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With