Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to avoid brackets in SQL around Django custom database function call?

A short intoduction to the problem...

  • PostgreSQL has very neat array fields (int array, string array) and functions for them like UNNEST and ANY.
  • These fields are supported by Django (I am using djorm_pgarray for that), but functions are not natively supported.
  • One could use .extra(), but Django 1.8 introduced a new concept of database functions.

Let me provide a most primitive example of what I am basicly doing with all these. A Dealer has a list of makes that it supports. A Vehicle has a make and is linked to a dealer. But it happens that Vehicle's make does not match Dealer's make list, that is inevitable.

MAKE_CHOICES = [('honda', 'Honda'), ...]

class Dealer(models.Model):
    make_list = TextArrayField(choices=MAKE_CHOICES)

class Vehicle(models.Model):
    dealer = models.ForeignKey(Dealer, null=True, blank=True)
    make   = models.CharField(max_length=255, choices=MAKE_CHOICES, blank=True)

Having a database of dealers and makes, I want to count all vehicles for which the vehicle's make and its dealer's make list do match. That's how I do it avoiding .extra().

from django.db.models import functions

class SelectUnnest(functions.Func):
    function = 'SELECT UNNEST'

...

Vehicle.objects.filter(
    make__in=SelectUnnest('dealer__make_list')
).count()

Resulting SQL:

SELECT COUNT(*) AS "__count" FROM "myapp_vehicle" 
INNER JOIN "myapp_dealer" 
  ON ( "myapp_vehicle"."dealer_id" = "myapp_dealer"."id" ) 
WHERE "myapp_vehicle"."make" 
  IN (SELECT UNNEST("myapp_dealer"."make_list"))

And it works, and much faster than a traditional M2M approach we could use in Django. BUT, for this task, UNNEST is not a very good solution: ANY is much faster. Let's try it.

class Any(functions.Func):
    function = 'ANY'

...

Vehicle.objects.filter(
    make=Any('dealer__make_list')
).count()

It generates the following SQL:

 SELECT COUNT(*) AS "__count" FROM "myapp_vehicle" 
 INNER JOIN "myapp_dealer" 
   ON ( "myapp_vehicle"."dealer_id" = "myapp_dealer"."id" )
 WHERE "myapp_vehicle"."make" = 
  (ANY("myapp_dealer"."make_list"))

And it fails, because braces around ANY are bogus. If you remove them, it runs in the psql console with no problems, and fast.

So my question.

  1. Is there any way to remove these braces? I could not find anything about that in Django documentation.
  2. If not, - maybe there are other ways to rephrase this query?

P. S. I think that an extensive library of database functions for different backends would be very helpful for database-heavy Django apps.

Of course, most of these will not be portable. But you typically do not often migrate such a project from one database backend to another. In our example, using array fields and PostGIS we are stuck to PostgreSQL and do not intend to move.

Is anybody developing such a thing?

P. P. S. One might say that, in this case, we should be using a separate table for makes and intarray instead of string array, that is correct and will be done, but nature of problem does not change.

UPDATE.

  • TextArrayField is defined at djorm_pgarray. At the linked source file, you can see how it works.
  • The value is list of text strings. In Python, it is represented as a list. Example: ['honda', 'mazda', 'anything else'].

Here is what is said about it in the database.

=# select id, make from appname_tablename limit 3;
id | make
---+----------------------
58 | {vw}
76 | {lexus,scion,toyota}
39 | {chevrolet}

And underlying PostgreSQL field type is text[].

like image 239
Anatoly Scherbakov Avatar asked May 13 '16 09:05

Anatoly Scherbakov


1 Answers

I've managed to get (more or less) what you need using following:

from django.db.models.lookups import BuiltinLookup
from django.db.models.fields import Field

class Any(BuiltinLookup):
    lookup_name = 'any'

    def get_rhs_op(self, connection, rhs):
       return " = ANY(%s)" % (rhs,)

Field.register_lookup(Any)

and query:

Vehicle.objects.filter(make__any=F('dealer__make_list')).count()

as result:

SELECT COUNT(*) AS "__count" FROM "zz_vehicle"
  INNER JOIN "zz_dealer" ON ("zz_vehicle"."dealer_id" = "zz_dealer"."id")
    WHERE "zz_vehicle"."make"  = ANY(("zz_dealer"."make_list"))

btw. instead djorm_pgarray and TextArrayField you can use native django:

make_list = ArrayField(models.CharField(max_length=200), blank=True)

(to simplify your dependencies)

like image 157
Jerzyk Avatar answered Nov 15 '22 17:11

Jerzyk