
Primary key requirement in raw SQL complicates the query in Django

To get the maximum value from a simple table of values, I can write the following query in Django:

MyTable.objects.aggregate(Max('value'))

The generated SQL is: 'SELECT MAX("mytable"."value") AS "value__max" FROM "mytable"'

Now if I write the same SQL using the raw query manager:

1. MyTable.objects.raw('SELECT max(value) FROM mytable')

Django throws an error InvalidQuery: Raw query must include the primary key. This is also mentioned in Django docs: "There is only one field that you can’t leave out - the primary key field". So after adding the id field, I need GROUP BY as well. The new query becomes:

2. MyTable.objects.raw('SELECT id, max(value) FROM mytable GROUP BY id')

This doesn't give me a single max value anymore, because GROUP BY id produces one group per row. Now I need to add ORDER BY and LIMIT clauses to get the expected answer from what is otherwise a simple SQL statement that works on its own.

3. MyTable.objects.raw('SELECT id, max(value) AS mv FROM mytable GROUP BY id ORDER BY mv DESC LIMIT 1')
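
To see why query (2) returns one row per id instead of a single maximum, here is a minimal sketch that runs the three statements with Python's built-in sqlite3 module rather than Django and PostgreSQL. The table contents are invented for illustration, and an ORDER BY id is added to query (2) only to make the output order predictable.

```python
import sqlite3

# Hypothetical stand-in for "mytable"; the values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable (id, value) VALUES (?, ?)",
    [(1, 10), (2, 40), (3, 25)],
)

# Query (1): one aggregate over the whole table, so exactly one row.
single_max = conn.execute("SELECT max(value) FROM mytable").fetchall()

# Query (2): grouping by the primary key makes every row its own group,
# so max(value) is just that row's value.
per_id = conn.execute(
    "SELECT id, max(value) FROM mytable GROUP BY id ORDER BY id"
).fetchall()

# Query (3): ORDER BY and LIMIT cut the groups back down to one row.
top_row = conn.execute(
    "SELECT id, max(value) AS mv FROM mytable "
    "GROUP BY id ORDER BY mv DESC LIMIT 1"
).fetchall()

print(single_max)  # [(40,)]
print(per_id)      # [(1, 10), (2, 40), (3, 25)]
print(top_row)     # [(2, 40)]
```

Query (3) ends up with the same maximum as query (1), but only after building and sorting one group per row, which matches the extra HashAggregate and Sort steps in a PostgreSQL plan for that statement.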

Is there a way to simplify the above query, i.e. without ORDER BY, LIMIT, or GROUP BY (FWIW, I'm using PostgreSQL)?

Update:

Here's a hack that'll work. I alias the max value as id to make Django happy. Is there any issue here?

MyTable.objects.raw('SELECT max(value) AS id FROM mytable')
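
One possible issue with this hack, sketched here with sqlite3 instead of Django: raw() treats whatever is aliased as id as the primary key, and Django loads fields that are missing from a raw query lazily, by primary key. If the maximum happens to equal some row's id, that later lookup would read an unrelated row. The table contents and the lookup statement below are assumptions that mimic this behaviour; they are not Django's actual SQL.

```python
import sqlite3

# Hypothetical data where the maximum value (3) is also a valid id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable (id, value) VALUES (?, ?)",
    [(1, 3), (2, 1), (3, 2)],
)

# The hack: the maximum value comes back under the name "id".
fake_pk = conn.execute("SELECT max(value) AS id FROM mytable").fetchone()[0]

# Emulated lazy load of the "value" field using that fake primary key.
row = conn.execute(
    "SELECT value FROM mytable WHERE id = ?", (fake_pk,)
).fetchone()
looked_up_value = row[0] if row is not None else None

print(fake_pk)          # 3, the real maximum
print(looked_up_value)  # 2, the value of the unrelated row with id 3
```

Reading only .id from the returned instance avoids the lookup, so the hack can work as long as no other field of the instance is accessed.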

Update 2:

Here's the query plan for the simple SQL (1) vs the complicated final one (3):

"Aggregate  (cost=5.25..5.26 rows=1 width=2) (actual time=0.155..0.155 rows=1 loops=1)"
"  ->  Seq Scan on mytable  (cost=0.00..4.60 rows=260 width=2) (actual time=0.018..0.067 rows=260 loops=1)"
"Total runtime: 0.222 ms"


"Limit  (cost=9.80..9.80 rows=1 width=6) (actual time=0.548..0.548 rows=1 loops=1)"
"  ->  Sort  (cost=9.80..10.45 rows=260 width=6) (actual time=0.545..0.545 rows=1 loops=1)"
"        Sort Key: (max(value))"
"        Sort Method: top-N heapsort  Memory: 25kB"
"        ->  HashAggregate  (cost=5.90..8.50 rows=260 width=6) (actual time=0.328..0.432 rows=260 loops=1)"
"              ->  Seq Scan on mytable  (cost=0.00..4.60 rows=260 width=6) (actual time=0.018..0.069 rows=260 loops=1)"
"Total runtime: 0.638 ms"

P.S. The actual query is more complicated (somewhat related to this answer : https://dba.stackexchange.com/a/86404/52114)

user4150760 asked Dec 19 '14 06:12



3 Answers

You should use custom SQL instead of the Manager.raw() method:

from django.db import connection

cursor = connection.cursor()
cursor.execute('SELECT max(value) FROM mytable')
max_value = cursor.fetchone()[0]
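
Django's cursor follows the Python DB-API (PEP 249), so the same execute/fetchone pattern can be tried outside a Django project. The following is a minimal sketch with the built-in sqlite3 module; the helper name fetch_scalar and the table contents are hypothetical.

```python
import sqlite3
from contextlib import closing


def fetch_scalar(conn, sql, params=()):
    """Return the first column of the first row, or None if there is no row."""
    # closing() guarantees the cursor is released even if execute() raises.
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql, params)
        row = cursor.fetchone()
    return row[0] if row is not None else None


# Hypothetical stand-in for "mytable".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO mytable (value) VALUES (?)", [(10,), (40,), (25,)])

max_value = fetch_scalar(conn, "SELECT max(value) FROM mytable")
print(max_value)  # 40
```

In Django itself the cursor can be used as a context manager directly (with connection.cursor() as cursor:), which closes it in the same way. Note that max() over an empty table yields a single row containing NULL, so the result is None in that case.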
catavaran answered Nov 15 '22 08:11


You can use:

ModelName.objects.raw('SELECT 1 as id , max(value) FROM mytable')
Tinashe Robert answered Nov 15 '22 07:11


I just had the same issue, and @Tinashe Robert's answer did not work for me. Here is my solution (Python 3.7.7, Django 3.0.5):

PS: Yes, @catavaran's answer is the native Python way, but what if one really wants to use raw()?

  • First, one column of the MySQL table needs to be declared as the primary key in the model, i.e.:
 from django.db import models

 class MyTable(models.Model):
     # Explicit primary key column; raw() queries must select it.
     id = models.AutoField(primary_key=True)
     value = models.IntegerField()
     ... 

Then, in the Django shell:

>>> MyTable.objects.raw('SELECT id, max(value) as mx from MyTable')[0].mx

or, as a simpler solution that avoids raw SQL:

>>> from django.db.models import Max
>>> MyTable.objects.all().aggregate(Max('value'))
NegaOverflow answered Nov 15 '22 07:11