Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django db error: could not identify an equality operator for type json when trying to annotate a model with jsonfield

I'm working in Django 1.5.4 and PostgreSQL 9.3, using django-jsonfield for JSONField.

Following query throws db error (could not identify an equality operator for type json):

ModelWithJsonField.objects.annotate(count=Count('field_to_count_by'))

The field_to_count_by is not JSONField, normal int field.

Any ideas how i can solve the issue and still use annotate?

What annotate does behind the hood?

like image 671
Neara Avatar asked Oct 01 '13 13:10

Neara


3 Answers

I ran into the same problem and finally (today) implemented a fake operator by running this as admin in the psql console :

-- This creates a function named hashjson that transforms the
-- json to texts and generates a hash
CREATE OR REPLACE FUNCTION hashjson(            
    json          
) RETURNS INTEGER LANGUAGE SQL STRICT IMMUTABLE AS $$ 
    SELECT hashtext($1::text); 
$$; 

-- This creates a function named json_eq that checks equality (as text)
CREATE OR REPLACE FUNCTION json_eq(            
    json, 
    json              
) RETURNS BOOLEAN LANGUAGE SQL STRICT IMMUTABLE AS $$ 
    SELECT bttextcmp($1::text, $2::text) = 0; 
$$; 

-- This creates an operator from the equality function
CREATE OPERATOR = (            
    LEFTARG   = json, 
    RIGHTARG  = json, 
    PROCEDURE = json_eq 
); 

-- Finaly, this defines a new default JSON operator family with the
-- operators and functions we just defined.
CREATE OPERATOR CLASS json_ops
   DEFAULT FOR TYPE json USING hash AS
   OPERATOR 1  =,
   FUNCTION 1  hashjson(json);

(heavily inspired from this thread)

I also referenced your question in the django-jsonfield GitHub issue.

Note that :

  • I have very limited idea on the impact this will have. Maybe it's not really a good idea. The implementations are naive but they should be enough. Or maybe not.
  • In particular, the equality operator checks text equality and not semantic json equality. But as far as django-jsonField is concerned, I think there's little chance we really need the results to be correct (may be a SELECT FALSE would even do the trick).
like image 115
Joachim Jablon Avatar answered Nov 05 '22 11:11

Joachim Jablon


I went through the same issue, and then I tried Joachim Jablon's code, and although it seemed to work nicely, it still had issues. I'll get to the point here, the longest version is on my blog.

  • SELECT '{"a":1,"b":2}'::json = '{"b":2,"a":1}'::json returned false, because it is based on the string representation.
  • No sorting of fields is allowed, because the operator class is hash instead of btree.

Then, I created a json_cmp() function in PL/V8, that can be used to power the operators necessary for a btree.

Here is the complete SQL script

CREATE OR REPLACE FUNCTION json_cmp(left json, right json)
RETURNS integer AS $$
    function cleverType(obj) {
        var type = typeof obj;

        if (type === 'object') {
            if (obj === null) {
                type = 'null';
            } else if (obj instanceof Array) {
                type = 'array';
            }
        }

        return type;
    }

    function cmp(left, right) {
        var leftType = cleverType(left),
            rightType = cleverType(right),
            i,
            buf,
            leftKeys,
            rightKeys,
            output = 0;

        if (leftType !== rightType) {
            output = leftType.localeCompare(rightType);
        } else if (leftType === 'number'
                || leftType === 'boolean'
                || leftType === 'string') {
            if (left < right) {
                output = -1;
            } else if (left > right) {
                output = 1;
            } else {
                output = 0;
            }
        } else if (leftType === 'array') {
            if (left.length !== right.length) {
                output = cmp(left.length, right.length);
            } else {
                for (i = 0; i < left.length; i += 1) {
                    buf = cmp(left[i], right[i]);

                    if (buf !== 0) {
                        output = buf;
                        break;
                    }
                }
            }
        } else if (leftType === 'object') {
            leftKeys = Object.keys(left);
            rightKeys = Object.keys(right);

            if (leftKeys.length !== rightKeys.length) {
                leftKeys.sort();
                rightKeys.sort();
                buf = cmp(leftKeys, rightKeys);
            } else {
                buf = cmp(leftKeys.length, rightKeys.length);
            }

            if (buf !== 0) {
                output = buf;
            } else {
                for (i = 0; i < leftKeys.length; i += 1) {
                    buf = cmp(left[leftKeys[i]], right[leftKeys[i]]);

                    if (buf !== 0) {
                        output = buf;
                        break;
                    }
                }
            }
        }

        return output;
    }

    return cmp(left, right);
$$ LANGUAGE plv8 IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION json_eq(json, json)
RETURNS BOOLEAN LANGUAGE SQL STRICT IMMUTABLE AS $$
    SELECT json_cmp($1, $2) = 0;
$$;

CREATE OR REPLACE FUNCTION json_lt(json, json)
RETURNS BOOLEAN LANGUAGE SQL STRICT IMMUTABLE AS $$
    SELECT json_cmp($1, $2) < 0;
$$;

CREATE OR REPLACE FUNCTION json_lte(json, json)
RETURNS BOOLEAN LANGUAGE SQL STRICT IMMUTABLE AS $$
    SELECT json_cmp($1, $2) <= 0;
$$;

CREATE OR REPLACE FUNCTION json_gt(json, json)
RETURNS BOOLEAN LANGUAGE SQL STRICT IMMUTABLE AS $$
    SELECT json_cmp($1, $2) > 0;
$$;

CREATE OR REPLACE FUNCTION json_gte(json, json)
RETURNS BOOLEAN LANGUAGE SQL STRICT IMMUTABLE AS $$
    SELECT json_cmp($1, $2) >= 0;
$$;

CREATE OPERATOR =  (LEFTARG = json, RIGHTARG = json, PROCEDURE = json_eq);
CREATE OPERATOR <  (LEFTARG = json, RIGHTARG = json, PROCEDURE = json_lt);
CREATE OPERATOR <= (LEFTARG = json, RIGHTARG = json, PROCEDURE = json_lte);
CREATE OPERATOR >  (LEFTARG = json, RIGHTARG = json, PROCEDURE = json_gt);
CREATE OPERATOR >= (LEFTARG = json, RIGHTARG = json, PROCEDURE = json_gte);

CREATE OPERATOR CLASS json_ops
   DEFAULT FOR TYPE json USING btree AS
   OPERATOR 1 <,
   OPERATOR 2 <=,
   OPERATOR 3 =,
   OPERATOR 4 >=,
   OPERATOR 5 >,
   FUNCTION 1 json_cmp(json, json);

This tends to be a lot slower than simple string comparison, of course, but has the advantage of producing more robust results.

Please note that if you use South for your migrations, you can create an empty migration and execute the SQL from the forwards() method. This will automatically install the functions when you migrate your app.

like image 24
Xowap Avatar answered Nov 05 '22 12:11

Xowap


My solution uses PL/Python, it parses and re-dumps the json sorting the keys, and then yields the FNV1a hash of the result: https://github.com/ifad/chronomodel/blob/master/sql/json_ops.sql.

I am not using hashtext() because it is for internal use only: http://www.postgresql.org/message-id/[email protected].

It is not a silver bullet, just a crude hack. The real solution is to wait for full support in Postgres.

like image 24
vjt Avatar answered Nov 05 '22 13:11

vjt