I want to iterate over all the objects of a table (Post). I am using the code below:
posts = Post.objects.all()
for post in posts:
    process_post(post)
process_post is a Celery task that runs in the background, and it does not update the post. The problem is that the Post table has 1 million records. This is not a one-time job; I run it daily.
The line "for post in posts" executes the query, which fetches all the data from the DB in one go.
How can I improve its performance? Is there any way to fetch the data in batches?
Make your own iterator. For example, say the table has 1 million records.
count = Post.objects.count()  # about 1 million
chunk_size = 1000
for i in range(0, count, chunk_size):
    # Order by primary key so consecutive slices do not overlap or skip rows
    posts = Post.objects.order_by('pk')[i:i + chunk_size]
    for post in posts:
        process_post(post)
Slicing a queryset translates to LIMIT and OFFSET in the SQL query. As chunk_size increases, the number of queries decreases, but memory usage increases. Tune it for your use case.
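One caveat with LIMIT/OFFSET is that the database usually still has to walk past the skipped rows, so later chunks tend to get slower on a large table. A possible alternative, swapped in here and not part of the approach above, is to page by primary key: remember the last pk seen and ask for rows with a larger pk. The sketch below shows only the batching logic and is not a definitive implementation; iter_keyset_batches, fetch_after and get_pk are hypothetical names, and an in-memory list stands in for the Post table so the example runs without Django.

```python
from typing import Callable, Iterator, List, Optional, Sequence, TypeVar

Row = TypeVar("Row")


def iter_keyset_batches(
    fetch_after: Callable[[Optional[int], int], Sequence[Row]],
    get_pk: Callable[[Row], int],
    batch_size: int = 1000,
) -> Iterator[List[Row]]:
    """Yield batches of rows, paging by 'pk > last seen pk' instead of OFFSET."""
    last_pk: Optional[int] = None
    while True:
        batch = list(fetch_after(last_pk, batch_size))
        if not batch:
            return  # no rows left
        yield batch
        last_pk = get_pk(batch[-1])  # the next page starts after this row


# Hypothetical Django wiring (assumes Post has an orderable primary key):
#
# def fetch_posts_after(last_pk, limit):
#     qs = Post.objects.order_by("pk")
#     if last_pk is not None:
#         qs = qs.filter(pk__gt=last_pk)
#     return qs[:limit]
#
# for batch in iter_keyset_batches(fetch_posts_after, lambda p: p.pk):
#     for post in batch:
#         process_post(post)

# Stand-in data so the sketch runs without a database.
ROWS = [{"pk": pk} for pk in range(1, 11)]


def fetch_rows_after(last_pk: Optional[int], limit: int) -> List[dict]:
    """Mimic 'WHERE pk > last_pk ORDER BY pk LIMIT limit' on the list above."""
    matching = [r for r in ROWS if last_pk is None or r["pk"] > last_pk]
    return matching[:limit]


batch_sizes = [
    len(batch)
    for batch in iter_keyset_batches(fetch_rows_after, lambda r: r["pk"], batch_size=4)
]
```

Django also provides QuerySet.iterator(chunk_size=...), which avoids caching the whole result set on the queryset; whether it or manual batching fits better depends on your database and on how process_post is dispatched.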