The Problem
I'm trying to use the Django ORM to do the equivalent of a SQL NOT IN
clause, providing a list of IDs in a subselect to bring back a set of records from the logging table. I can't figure out if this is possible.
The Model
class JobLog(models.Model):
    job_number = models.BigIntegerField(blank=True, null=True)
    name = models.TextField(blank=True, null=True)
    username = models.TextField(blank=True, null=True)
    event = models.TextField(blank=True, null=True)
    time = models.DateTimeField(blank=True, null=True)
What I've Tried
My first attempt was to use exclude, but this uses NOT to negate the entire Subquery, rather than producing the desired NOT IN:
query = (
    JobLog.objects.values(
        "username", "job_number", "name", "time",
    )
    .filter(time__gte=start, time__lte=end, event="delivered")
    .exclude(
        job_number__in=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )
    )
)
Unfortunately, this yields this SQL:
SELECT "view_job_log"."username", "view_job_log"."group", "view_job_log"."job_number", "view_job_log"."name", "view_job_log"."time"
FROM "view_job_log"
WHERE (
    "view_job_log"."event" = 'delivered'
    AND "view_job_log"."time" >= '2020-03-12T11:22:28.300590+00:00'::timestamptz
    AND "view_job_log"."time" <= '2020-03-13T11:22:28.300600+00:00'::timestamptz
    AND NOT (
        "view_job_log"."job_number" IN (
            SELECT U0."job_number"
            FROM "view_job_log" U0
            WHERE (
                U0."event" = 'finished'
                AND U0."time" >= '2020-03-12T11:22:28.300590+00:00'::timestamptz
                AND U0."time" <= '2020-03-13T11:22:28.300600+00:00'::timestamptz
            )
        )
        AND "view_job_log"."job_number" IS NOT NULL
    )
)
What I need is for the third AND clause to be AND "view_job_log"."job_number" NOT IN instead of the AND NOT (.
I've also tried doing the sub-select as its own query first, with an exclude, as suggested here: Django equivalent of SQL not in
However, this yields the same problematic result. Then I tried a Q object, which yields a similar query:
query = (
    JobLog.objects.values(
        "username", "subscriber_code", "job_number", "name", "time",
    )
    .filter(
        ~models.Q(job_number__in=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )),
        time__gte=start,
        time__lte=end,
        event="delivered",
    )
)
This attempt with the Q object yields the following SQL, again without the NOT IN:
SELECT "view_job_log"."username", "view_job_log"."group", "view_job_log"."job_number", "view_job_log"."name", "view_job_log"."time"
FROM "view_job_log"
WHERE (
    NOT (
        "view_job_log"."job_number" IN (
            SELECT U0."job_number"
            FROM "view_job_log" U0
            WHERE (
                U0."event" = 'finished'
                AND U0."time" >= '2020-03-12T11:33:28.098653+00:00'::timestamptz
                AND U0."time" <= '2020-03-13T11:33:28.098678+00:00'::timestamptz
            )
        )
        AND "view_job_log"."job_number" IS NOT NULL
    )
    AND "view_job_log"."event" = 'delivered'
    AND "view_job_log"."time" >= '2020-03-12T11:33:28.098653+00:00'::timestamptz
    AND "view_job_log"."time" <= '2020-03-13T11:33:28.098678+00:00'::timestamptz
)
Is there any way to get Django's ORM to do something equivalent to AND job_number NOT IN (12345, 12346, 12347)? Or am I going to have to drop to raw SQL to accomplish this?
Thanks in advance for reading this entire wall-of-text question. Explicit is better than implicit. :)
I think the easiest way to do this would be to define a custom lookup, similar to this one or the built-in in lookup:
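from django.db.models import Field
from django.db.models.lookups import In as LookupIn

class NotIn(LookupIn):
    lookup_name = "notin"

    def get_rhs_op(self, connection, rhs):
        # Same as the built-in "in" lookup, but emit NOT IN instead of IN.
        return "NOT IN %s" % rhs

Field.register_lookup(NotIn)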
or
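from django.db import models

class NotIn(models.Lookup):
    lookup_name = "notin"

    def as_sql(self, compiler, connection):
        # Compile both sides, then join them with NOT IN.
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        # Build a new list so this works whether params are lists or tuples.
        params = [*lhs_params, *rhs_params]
        return "%s NOT IN %s" % (lhs, rhs), params

models.Field.register_lookup(NotIn)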
Then use it in your query:
query = (
    JobLog.objects.values(
        "username", "job_number", "name", "time",
    )
    .filter(time__gte=start, time__lte=end, event="delivered")
    .filter(
        job_number__notin=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )
    )
)
This generates the SQL:
SELECT
    "people_joblog"."username",
    "people_joblog"."job_number",
    "people_joblog"."name",
    "people_joblog"."time"
FROM
    "people_joblog"
WHERE ("people_joblog"."event" = 'delivered'
    AND "people_joblog"."time" >= '2020-03-13 15:24:34.691222+00:00'
    AND "people_joblog"."time" <= '2020-03-13 15:24:41.678069+00:00'
    AND "people_joblog"."job_number" NOT IN (
        SELECT
            U0."job_number"
        FROM
            "people_joblog" U0
        WHERE (U0."event" = 'finished'
            AND U0."time" >= '2020-03-13 15:24:34.691222+00:00'
            AND U0."time" <= '2020-03-13 15:24:41.678069+00:00')))
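To see where the two WHERE clause shapes actually differ, here is a minimal sketch that runs both against an in-memory SQLite database using Python's standard sqlite3 module. The table layout and sample rows are made up for illustration (a cut-down job_log with only job_number and event), and compare_not_in_forms is a hypothetical helper name. It assumes SQLite's NULL handling for IN matches the database you are targeting, which should hold for standard three-valued logic but is worth checking.

```python
import sqlite3


def compare_not_in_forms():
    """Return (exclude_style_rows, not_in_rows) for a tiny, made-up job log."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE job_log (job_number INTEGER, event TEXT)")
    conn.executemany(
        "INSERT INTO job_log (job_number, event) VALUES (?, ?)",
        [
            (1, "delivered"),
            (1, "finished"),
            (2, "delivered"),
            (None, "delivered"),  # delivered row with no job number
        ],
    )
    finished = "SELECT u.job_number FROM job_log u WHERE u.event = 'finished'"

    # Shape of the clause produced by exclude(job_number__in=...).
    exclude_style = conn.execute(
        "SELECT job_number FROM job_log WHERE event = 'delivered' "
        f"AND NOT (job_number IN ({finished}) AND job_number IS NOT NULL) "
        "ORDER BY job_number"
    ).fetchall()

    # Shape of the clause produced by a NOT IN lookup.
    not_in = conn.execute(
        "SELECT job_number FROM job_log WHERE event = 'delivered' "
        f"AND job_number NOT IN ({finished}) "
        "ORDER BY job_number"
    ).fetchall()

    conn.close()
    return exclude_style, not_in


if __name__ == "__main__":
    exclude_style, not_in = compare_not_in_forms()
    print("exclude-style:", exclude_style)
    print("NOT IN:       ", not_in)
```

On this sample data both forms drop job 1 and keep job 2. The only difference is the delivered row whose job_number is NULL: the exclude-style clause keeps it, while plain NOT IN drops it. If job_number is never NULL in the rows you care about, the two forms should return the same rows.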
You can likely achieve the same results by using an Exists and special-casing NULLs.
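# Requires: from django.db.models import Exists, OuterRef, Q
.filter(
    ~Exists(
        JobLog.objects.filter(
            # The model field is job_number; a NULL in the subquery matches every outer row.
            Q(job_number=None) | Q(job_number=OuterRef('job_number')),
            time__gte=start,
            time__lte=end,
            event='finished',
        )
    )
)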
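The reason for special-casing NULL can be checked at the SQL level. The sketch below, again using sqlite3 with a made-up two-column job_log table and a hypothetical helper not_in_vs_not_exists, compares NOT IN against NOT EXISTS with the extra IS NULL branch. It is only an approximation of what the ORM would emit for the ~Exists filter, not the exact generated query.

```python
import sqlite3


def not_in_vs_not_exists(finished_numbers):
    """Return (not_in_result, not_exists_result) as lists of job numbers.

    Jobs 1, 2 and 3 are 'delivered'; finished_numbers lists the job
    numbers (possibly including None) that have a 'finished' row.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE job_log (job_number INTEGER, event TEXT)")
    rows = [(n, "delivered") for n in (1, 2, 3)]
    rows += [(n, "finished") for n in finished_numbers]
    conn.executemany("INSERT INTO job_log VALUES (?, ?)", rows)

    not_in = conn.execute(
        "SELECT o.job_number FROM job_log o WHERE o.event = 'delivered' "
        "AND o.job_number NOT IN ("
        "  SELECT u.job_number FROM job_log u WHERE u.event = 'finished'"
        ") ORDER BY o.job_number"
    ).fetchall()

    # NOT EXISTS with the NULL special case: a finished row with a NULL
    # job number matches every outer row, as it effectively does for NOT IN.
    not_exists = conn.execute(
        "SELECT o.job_number FROM job_log o WHERE o.event = 'delivered' "
        "AND NOT EXISTS ("
        "  SELECT 1 FROM job_log u WHERE u.event = 'finished' "
        "  AND (u.job_number IS NULL OR u.job_number = o.job_number)"
        ") ORDER BY o.job_number"
    ).fetchall()

    conn.close()
    return [r[0] for r in not_in], [r[0] for r in not_exists]


if __name__ == "__main__":
    print(not_in_vs_not_exists([1]))
    print(not_in_vs_not_exists([1, None]))
```

With only job 1 finished, both queries return jobs 2 and 3. Once a finished row has a NULL job number, both return nothing, which is the NOT IN behaviour the IS NULL branch is there to reproduce. Outer rows whose own job_number is NULL are not covered by this sketch and can still behave differently between the two forms.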