
How do you split a Django queryset without evaluating it?

I am dealing with a queryset of over 5 million items (for batch ML purposes) and I need to split it so that I can process it with multiple threads. I want to do this without evaluating the queryset: I only ever need to access each item once, so I don't want the items to be cached, which is what evaluating the queryset does.

Is it possible to select the items into one queryset and split it without evaluating it? Or will I have to query for multiple querysets using limits ([:size]) to achieve this behaviour?

N.B.: I am aware that an iterator can be used to cycle through a queryset without caching its results, but my question is about how I can split a queryset (if possible) and then run an iterator on each of the split querysets.

bmjrowe asked Dec 23 '17 00:12



1 Answer

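Django provides a few classes that help you manage paginated data, that is, data that’s split across several pages, with “Previous/Next” links. A Paginator slices the queryset lazily, so each page is fetched by its own query only when you iterate over that page: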

from django.core.paginator import Paginator

# Order explicitly so that pages are stable and do not overlap
object_list = MyModel.objects.order_by("pk")
paginator = Paginator(object_list, 10)  # 10 objects per page; you can choose any other value

# page_range is a property, not a method: a 1-based range of page numbers, e.g. 1, 2, 3, 4
for i in paginator.page_range:
    data = iter(paginator.get_page(i))  # the page's rows are fetched when it is iterated
    # use data
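
If you would rather slice the queryset yourself, as the question suggests with [:size], the split can be sketched as below. This is only a minimal sketch under some assumptions: slice_bounds is a hypothetical helper (not part of Django), the chunk size is arbitrary, and MyModel stands for the model used above. Slicing an unevaluated queryset returns another unevaluated queryset (a LIMIT/OFFSET query), and .iterator() streams the rows without filling the queryset's result cache. The Django part is shown in comments so that the helper itself runs on its own.

```python
from typing import Iterator, Tuple


def slice_bounds(total: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) offsets that cover range(total) in chunk_size steps."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total, chunk_size):
        # The last chunk is clipped so that it never runs past total
        yield start, min(start + chunk_size, total)


# Hypothetical usage with a Django queryset (MyModel is assumed, as above):
#
#   qs = MyModel.objects.order_by("pk")   # a stable ordering keeps the slices disjoint
#   total = qs.count()                    # a COUNT query; no rows are fetched
#   chunks = [qs[start:stop] for start, stop in slice_bounds(total, 100_000)]
#   for chunk in chunks:                  # each chunk is still an unevaluated queryset
#       for obj in chunk.iterator():      # stream rows without caching them
#           ...                           # process obj once

print(list(slice_bounds(10, 4)))  # → [(0, 4), (4, 8), (8, 10)]
```

Each chunk can then be handed to its own worker thread. One caveat to keep in mind: on many databases a large OFFSET tends to get slower the further into the table it reaches, so for millions of rows it may be worth comparing this against filtering on primary-key ranges.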
Dhia answered Sep 18 '22 09:09
