I have a model Line with total_value and group fields.
I use the following code to get the 10 lines with the highest values within a given group:
group_lines = Line.objects.filter(group_id=group_pk, total_value__gt=0)
sorted_lines = group_lines.order_by('-total_value')[:10]
ids = sorted_lines.values_list("id", flat=True)
My database is very large, with 10M+ lines. The group_lines query alone returns 1000 lines.
My issue is that the values_list query takes about 2 seconds to execute.
If I remove the ordering, it is almost instant.
Is it normal to take that long to order 1000 objects? How can I make this query faster?
I am on Django 2.1.7, with a MySQL database.
You can add an index on the database field. For most databases this will be a B-tree, which speeds up sorting significantly:
from django.db import models

class Line(models.Model):
    # …
    total_value = models.IntegerField(db_index=True)
You can also use a combined index:
class Line(models.Model):
    # …
    total_value = models.IntegerField(db_index=True)

    class Meta:
        indexes = [
            models.Index(fields=['group', 'total_value'])
        ]
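This will speed up both the filtering on group and the ordering by total_value, since the index keeps the entries of each group sorted by total_value and the database can typically read the top rows directly instead of sorting them.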
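To see the effect of a combined index on the query plan, here is a minimal sketch. It uses Python's built-in sqlite3 module instead of MySQL so that it runs without any setup; the table layout, the sample data and the index name line_group_value_idx are made up for illustration, and MySQL's EXPLAIN output looks different.

```python
import sqlite3

# Raw SQL roughly equivalent to the queryset in the question (an assumption,
# not the exact SQL Django generates).
QUERY = (
    "SELECT id FROM line "
    "WHERE group_id = ? AND total_value > 0 "
    "ORDER BY total_value DESC LIMIT 10"
)


def query_plan(conn, group_id):
    """Return the human-readable plan steps SQLite reports for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (group_id,)).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE line ("
    "id INTEGER PRIMARY KEY, group_id INTEGER, total_value INTEGER)"
)
# Hypothetical sample data: 5000 rows spread over 50 groups.
conn.executemany(
    "INSERT INTO line (group_id, total_value) VALUES (?, ?)",
    [(i % 50, (i * 37) % 1000) for i in range(5000)],
)

# Without an index the plan includes a separate sorting step.
plan_without_index = query_plan(conn, 7)

# Combined index on (group_id, total_value), like the Meta.indexes entry.
conn.execute("CREATE INDEX line_group_value_idx ON line (group_id, total_value)")
plan_with_index = query_plan(conn, 7)

top_ids = [row[0] for row in conn.execute(QUERY, (7,))]

print(plan_without_index)
print(plan_with_index)
```

In SQLite the first plan contains a temporary B-tree step for the ORDER BY, while the second one searches the combined index and needs no separate sort. On MySQL you can run EXPLAIN on the generated SQL, or call explain() on the queryset (available since Django 2.1), to check whether the index is actually used.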
Updating the index has a time complexity of roughly O(log N), which is comparable to the cost of filtering, and retrieving rows through the index will often take roughly O(log N) as well.
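As a rough illustration of why the lookup is cheap, the sketch below imitates the combined index with a sorted Python list and a binary search. A real B-tree is a different data structure, and the sample data and the helper name top_values are invented for this example; it only shows the idea of jumping to the end of a group in O(log N) and then reading a handful of entries.

```python
import bisect

# Sorted (group_id, total_value, line_id) tuples stand in for index entries.
# Group ids are assumed to be integers.
entries = sorted((i % 50, (i * 37) % 1000, i) for i in range(5000))


def top_values(entries, group_id, limit=10):
    """Return up to `limit` ids with the highest positive total_value in a group."""
    # Binary search for the position just past the group's last entry.
    end = bisect.bisect_left(entries, (group_id + 1,))
    result = []
    # Walk backwards from the largest value and stop as soon as possible.
    for pos in range(end - 1, -1, -1):
        group, value, line_id = entries[pos]
        if group != group_id or value <= 0:
            break
        result.append(line_id)
        if len(result) == limit:
            break
    return result


print(top_values(entries, 7))
```

The search touches about log2(N) entries to find the starting point and then at most `limit` more, no matter how many lines the group contains.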