I am aware that both a regular queryset and the queryset iterator() method evaluate and return the entire data set in one shot.
For instance, take this:
my_objects = MyObject.objects.all()
for row in my_objects:  # Way 1
    ...
for row in my_objects.iterator():  # Way 2
    ...
Question
In both methods all the rows are fetched in a single go. Is there any way in Django for the queryset rows to be fetched one by one from the database?
Why this weird Requirement
At present my query fetches, let's say, n rows, but sometimes I get the Python/Django error OperationalError (2006, 'MySQL server has gone away').
As a workaround, I am currently using some weird while-loop logic.
So I was wondering whether there is a native or built-in method, or whether my question is even logical in the first place! :)
I think you are looking to limit your query set.
Quote from above link:
Use a subset of Python’s array-slicing syntax to limit your QuerySet to a certain number of results. This is the equivalent of SQL’s LIMIT and OFFSET clauses.
In other words, if you start with a count, you can then loop and take slices as you require them:
cnt = MyObject.objects.count()
start_point = 0
inc = 5
while start_point < cnt:
    # The slice [start:start + inc] maps to OFFSET start LIMIT inc.
    filtered = MyObject.objects.all()[start_point:start_point + inc]
    start_point += inc
Of course, you may need to add more error handling around this.
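The slicing loop above can also be wrapped in a small generator. This is only a sketch: `iter_slices` is a hypothetical helper name, and it is exercised here with a plain list so that it runs without a database. With Django you would presumably pass an ordered queryset such as `MyObject.objects.order_by('pk')` together with its `count()`, since slicing an unordered queryset does not guarantee stable pages.

```python
def iter_slices(sliceable, total, batch_size=5):
    """Yield consecutive slices of `sliceable`, each at most `batch_size` long."""
    for start in range(0, total, batch_size):
        # For a Django QuerySet, each slice becomes its own LIMIT/OFFSET query.
        yield sliceable[start:start + batch_size]


rows = list(range(12))  # stand-in for an ordered queryset
batches = [list(batch) for batch in iter_slices(rows, len(rows), batch_size=5)]
print(batches)
```

Note that the final slice is shorter than `batch_size` when the total is not an exact multiple, so the last few rows are not skipped.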
Fetching row by row might be worse. You might want to retrieve rows in batches of 1,000 or so instead. I have used this Django snippet (not my work) successfully with very large querysets. It doesn't eat up memory, and I had no trouble with connections going away.
Here's the snippet from that link:
import gc

def queryset_iterator(queryset, chunksize=1000):
    '''
    Iterate over a Django Queryset ordered by the primary key

    This method loads a maximum of chunksize (default: 1000) rows in its
    memory at the same time while django normally would load all rows in its
    memory. Using the iterator() method only causes it to not preload all the
    classes.

    Note that the implementation of the iterator does not support ordered query sets.
    '''
    pk = 0
    last_pk = queryset.order_by('-pk')[0].pk
    queryset = queryset.order_by('pk')
    while pk < last_pk:
        for row in queryset.filter(pk__gt=pk)[:chunksize]:
            pk = row.pk
            yield row
        gc.collect()
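The snippet pages by primary key rather than by offset: each chunk asks for rows whose pk is greater than the last one seen. To show that query pattern in isolation, here is a minimal sketch that uses the standard-library `sqlite3` module instead of Django. The `item` table and the `keyset_iterator` name are hypothetical, made up for illustration. It also stops when a chunk comes back empty rather than looking up the last pk first, which is a small departure from the snippet.

```python
import sqlite3


def keyset_iterator(conn, chunksize=1000):
    """Yield (id, name) rows from the hypothetical `item` table in id order."""
    last_id = 0  # assumes positive integer ids, like pk = 0 in the snippet
    while True:
        chunk = conn.execute(
            "SELECT id, name FROM item WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunksize),
        ).fetchall()
        if not chunk:
            break  # no rows left; this also covers an empty table
        for row in chunk:
            yield row
        last_id = chunk[-1][0]  # resume after the last id seen


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (id, name) VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 11)],
)
ids = [row[0] for row in keyset_iterator(conn, chunksize=3)]
print(ids)
```

Because every chunk is a short, separate query, no single statement has to stay open for the whole table, which is likely why this style copes better with connections that time out.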