When to use or not use iterator() in the django ORM

Tags: python, iterator, orm, django, django-queryset

This is from the Django docs on the QuerySet iterator() method:

A QuerySet typically caches its results internally so that repeated evaluations do not result in additional queries. In contrast, iterator() will read results directly, without doing any caching at the QuerySet level (internally, the default iterator calls iterator() and caches the return value). For a QuerySet which returns a large number of objects that you only need to access once, this can result in better performance and a significant reduction in memory.

After reading this, I'm still confused: the line about better performance and reduced memory suggests we should always use the iterator() method. Can someone give some examples of good and bad uses of iterator()?

Even if the query results are not cached, couldn't someone who really needs to access the models more than once just do the following?

saved_queries = list(Model.objects.all().iterator())

587

asked Oct 01 '12 21:10

Lucas Ou-Yang

2 Answers

Note the first part of the sentence you call out: "For a QuerySet which returns a large number of objects that you only need to access once".

So the converse of this is: if you need to re-use a set of results, and they are not so numerous as to cause a memory problem, then you should not use iterator(), because the extra database round trip is always going to reduce your performance compared with using the cached result.
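To make the round-trip argument concrete, here is a minimal toy model in plain Python. It is not Django's implementation: ToyQuerySet, its query_count counter and its _result_cache attribute are hypothetical names used only to mimic the behaviour the docs describe (normal iteration fills a cache, iterator() bypasses it).

```python
class ToyQuerySet:
    """Toy stand-in for a QuerySet; NOT Django's real implementation."""

    def __init__(self, rows):
        self._rows = rows          # pretend this is the database table
        self._result_cache = None  # filled on the first full evaluation
        self.query_count = 0       # how many times we "hit the database"

    def _fetch(self):
        # Each started fetch counts as one database query.
        self.query_count += 1
        for row in self._rows:
            yield row

    def iterator(self):
        # Streams rows and never fills the cache.
        return self._fetch()

    def __iter__(self):
        # Normal iteration: query once, then serve from the cache.
        if self._result_cache is None:
            self._result_cache = list(self._fetch())
        return iter(self._result_cache)


cached = ToyQuerySet(range(5))
first_pass = sum(cached)
second_pass = sum(cached)             # served from the cache, no new query

streamed = ToyQuerySet(range(5))
first_stream = sum(streamed.iterator())
second_stream = sum(streamed.iterator())  # a second "query"

print(cached.query_count, streamed.query_count)
```

In this model the cached object pays for one query and keeps every row in memory, while the streamed one keeps nothing but pays for a query on every pass. That is the trade-off to weigh.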

You could force your QuerySet to be evaluated into a list but:

it requires more typing than just saved_queries = Model.objects.all()
say you are paginating results on a web page: you will have forced all results into memory (back to possible memory problems) rather than allowing the subsequent paginator to select the slice of 20 results it needs
QuerySets are lazy, so you can have a context processor, for instance, that puts a QuerySet into the context of every request but is only evaluated on the requests that actually access it; if you have forced evaluation, that database hit happens on every request
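The pagination point can be illustrated without Django at all. The sketch below uses the standard-library sqlite3 module as a stand-in for the ORM; the item table, PAGE_SIZE and both helper functions are made up for the example. Slicing an unevaluated QuerySet results in a LIMIT/OFFSET query much like the second helper, whereas forcing a list first behaves like the first helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1000)],
)

PAGE_SIZE = 20  # assumed page size, matching the "20 results" above


def page_via_full_list(page_number):
    # Like list(queryset) followed by slicing: every row is loaded first.
    everything = conn.execute(
        "SELECT id, name FROM item ORDER BY id"
    ).fetchall()
    start = (page_number - 1) * PAGE_SIZE
    return everything[start:start + PAGE_SIZE], len(everything)


def page_via_limit(page_number):
    # Like slicing a lazy queryset: the database returns only one page.
    offset = (page_number - 1) * PAGE_SIZE
    rows = conn.execute(
        "SELECT id, name FROM item ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    ).fetchall()
    return rows, len(rows)


full_page, rows_loaded_full = page_via_full_list(3)
limited_page, rows_loaded_limited = page_via_limit(3)
print(rows_loaded_full, rows_loaded_limited)
```

Both helpers return the same page, but the first one pulls all 1000 rows into Python to do it.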

The typical web app case is for relatively small result sets (they have to be delivered to a browser in a timely fashion, so pagination or a similar technique is employed to decrease the data volume if required) so generally the standard QuerySet behaviour is what you want. As you are no doubt aware, you must store the QuerySet in a variable to get the benefit of the caching.

Good use of iterator: processing results that take up a large amount of available memory (lots of small objects or fewer large objects). In my experience this is often in management commands when doing heavy data processing.
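As a rough sketch of that kind of one-pass processing, the example below streams rows in batches using the standard-library sqlite3 module instead of Django. The reading table, the stream_rows helper and the chunk size of 500 are assumptions made for the illustration. In Django the analogous call would be something like Model.objects.all().iterator(chunk_size=500), and how much is really streamed depends on the database backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO reading (value) VALUES (?)",
    [(i,) for i in range(10_000)],
)


def stream_rows(cursor, chunk_size=500):
    """Yield rows one at a time while holding at most chunk_size in memory."""
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        yield from chunk


cursor = conn.execute("SELECT value FROM reading ORDER BY id")

total = 0
rows_seen = 0
for (value,) in stream_rows(cursor):
    # Each row is processed once and then discarded.
    total += value
    rows_seen += 1

print(rows_seen, total)
```

Nothing is kept after the loop, so a second pass would need a second query. That is acceptable here because the data is only needed once.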

175

answered Sep 29 '22 22:09

Steven

I agree with Steven and I would like to add two observations:

"it requires more typing than just saved_queries = Model.objects.all()". Yes, it does, but there is a more important difference between the two. If you assign list(Model.objects.all()) to a variable, the query is executed immediately and the results are stored in the list. Imagine you have more than 1M records: you now hold more than 1M records in a list that you may or may not use right away. So, as Steven said, I would recommend using plain Model.objects.all(): when that is assigned to a variable, the query is not executed until you actually use the variable, which saves you DB calls.
You should also consider prefetch_related() to avoid issuing too many queries to the database when you access related objects (for example through reverse lookups); it fetches them in a small number of extra queries and can save you a lot of time.
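The first observation, that assigning a lazy object costs nothing until it is used while list(...) pays immediately, can be shown with a small toy in plain Python. LazyResults, run_query and query_log are hypothetical names for this sketch and only imitate the lazy behaviour of a QuerySet.

```python
query_log = []  # records every simulated database hit


def run_query():
    # Stand-in for the real database call.
    query_log.append("SELECT * FROM model")
    return [1, 2, 3]


class LazyResults:
    """Toy lazy container: runs the query only when first iterated."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = None

    def __iter__(self):
        if self._cache is None:
            self._cache = self._fetch()
        return iter(self._cache)


lazy = LazyResults(run_query)            # like: qs = Model.objects.all()
queries_after_lazy_assignment = len(query_log)

eager = list(LazyResults(run_query))     # like: list(Model.objects.all())
queries_after_eager_assignment = len(query_log)

used_later = list(lazy)                  # the lazy object is finally used
queries_after_use = len(query_log)

print(queries_after_lazy_assignment,
      queries_after_eager_assignment,
      queries_after_use)
```

If the lazy variable is never used on a given code path, its query never runs at all. That saving is lost once evaluation is forced up front.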

answered Sep 29 '22 20:09

Tiago Silva
