I have a query set of approximately 1500 records from a Django ORM query. I have used the select_related() and only() methods to make sure the query is tight. I have also used connection.queries to make sure there is only this one query. That is, I have made sure no extra queries are getting called on each iteration.
When I run the query cut and paste from connection.queries it runs in 0.02 seconds. However, it takes seven seconds to iterate over those records and do nothing with them (pass).
What can I do to speed this up? What causes this slowness?
Use bulk query. Use bulk queries to efficiently query large data sets and reduce the number of database requests. Django ORM can perform several inserts or update operations in a single SQL query. If you're planning on inserting more than 5000 objects, specify batch_size.
This is because a Django QuerySet is a lazy object. It contains all of the information it needs to populate itself from the database, but will not actually do so until the information is needed.
A QuerySet can get pretty heavy when it's full of model objects. In similar situations, I've used the .values method on the queryset to specify the properties I need as a list of dictionaries, which can be much faster to iterate over.
Django documentation: values_list
1500 records is far from being a large dataset, and seven seconds is really too much. There is probably some problem in your models, you can easily check it by getting (as Brandon says) the values() query, and then create explicitly the 1500 object by iterating the dictionary. Just convert the ValuesQuerySet into a list before the construction to factor out the db connection.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With