I am dealing with a queryset of over 5 million items (for batch ML purposes), and I need to split it so I can run multithreaded operations on the pieces. I want to do this without evaluating the queryset: I only ever need to access each item once, so I don't want the caching of items that evaluation causes.
Is it possible to select the items into one queryset and split it without evaluating it? Or will I have to query for multiple querysets using limits ([:size]) to achieve this behaviour?
N.B.: I am aware that an iterable can be used to cycle through a queryset without evaluating it, but my question is about how I can split a queryset (if possible) and then run an iterable on each of the resulting querysets.
Django provides a few classes that help you manage paginated data – that is, data that’s split across several pages, with “Previous/Next” links:
from django.core.paginator import Paginator
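# Order explicitly so that pages are stable; Paginator warns on unordered querysets
object_list = MyModel.objects.order_by("pk")
paginator = Paginator(object_list, 10) # Show 10 objects per page, you can choose any other value
for i in paginator.page_range: # page_range is a property (not a method): a 1-based range of page numbers
    data = iter(paginator.get_page(i)) # each page is fetched only when it is accessed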
    # use data
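As an alternative to page numbers, the table can be split by primary-key ranges, which avoids the large OFFSET values that deep pages produce. The following is only a minimal sketch, not the answer's method: `pk_ranges` is a hypothetical helper, and it assumes `MyModel` has an integer primary key. The Django part is shown in comments because it needs a configured project to run.

```python
from typing import Iterator, Tuple


def pk_ranges(min_pk: int, max_pk: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) primary-key ranges that cover min_pk..max_pk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = min_pk
    while start <= max_pk:
        end = start + chunk_size
        yield start, end
        start = end


# Hypothetical use with Django (assumes MyModel has an integer primary key):
#
#   from django.db.models import Max, Min
#
#   bounds = MyModel.objects.aggregate(lo=Min("pk"), hi=Max("pk"))
#   for start, end in pk_ranges(bounds["lo"], bounds["hi"], 100_000):
#       # Building the filtered queryset does not hit the database.
#       chunk = MyModel.objects.filter(pk__gte=start, pk__lt=end)
#       # .iterator() streams rows without filling the queryset's result cache.
#       for obj in chunk.iterator():
#           ...  # use obj

if __name__ == "__main__":
    print(list(pk_ranges(1, 10, 4)))
```

Each (start, end) pair can be handed to a separate worker, which builds its own filtered queryset. Gaps in the primary-key sequence only make some chunks smaller than `chunk_size`; no rows are skipped or repeated.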