
Are Django's QuerySets lazy enough to cope with large data sets?

I think I've read somewhere that Django's ORM lazily loads objects. Let's say I want to update a large set of objects (say 500,000) in a batch-update operation. Would it be possible to simply iterate over a very large QuerySet, loading, updating and saving objects as I go?

Similarly, if I wanted to allow a paginated view of all these thousands of objects, could I use the built-in pagination facility, or would I have to manually run a window over the data set with a query each time because of the size of the QuerySet?

Joe asked Nov 05 '22 17:11

2 Answers

If you evaluate a 500,000-result queryset, the whole thing gets cached in memory. Instead, use the iterator() method on your queryset, which returns results as they are requested, without the huge memory consumption.
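A minimal sketch of that pattern; the function name, the batch size, and the `processed` field are illustrative assumptions, while `iterator()` and `save(update_fields=...)` are real QuerySet/model methods:

```python
def process_in_batches(queryset, batch_size=2000):
    """Iterate a large queryset without caching every row in memory.

    `queryset` is any Django QuerySet. iterator(chunk_size=...) streams
    rows from the database cursor instead of building the full result
    cache that plain iteration would create.
    """
    for obj in queryset.iterator(chunk_size=batch_size):
        # Load, update and save each object individually; the
        # `processed` field is a hypothetical example.
        obj.processed = True
        obj.save(update_fields=["processed"])
```

Note this still issues one UPDATE per object, so it trades memory for query count; it is only worth it when each object genuinely needs per-row Python logic.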

Also, use update() and F() objects to perform simple batch updates in a single query.

Dmitry Shevchenko answered Nov 11 '22 14:11

If the batch update can be expressed as a single SQL query, then raw SQL and the Django ORM won't differ much. But if the update actually requires loading each object, processing its data, and then saving it, you can either use the ORM or write your own SQL and run an update query for each processed record; the overhead depends entirely on the code logic.

The built-in pagination facility runs a LIMIT/OFFSET query (if you use it correctly), so I don't think there is significant overhead in the pagination either.

ranedk answered Nov 11 '22 13:11