I have a simple setup with django-haystack and the Whoosh engine. A search yielding 19 objects took 8 seconds. I used django-debug-toolbar to determine that I had a bunch of repeated queries.
I then updated my search view to prefetch relations, so that duplicate queries would not happen:
class MySearchView(SearchView):
    template_name = 'search_results.html'
    form_class = SearchForm
    queryset = RelatedSearchQuerySet().load_all().load_all_queryset(
        models.Customer, models.Customer.objects.all().select_related('customer_number').prefetch_related(
            'keywords'
        )
    ).load_all_queryset(
        models.Contact, models.Contact.objects.all().select_related('customer')
    ).load_all_queryset(
        models.Account, models.Account.objects.all().select_related(
            'customer', 'account_number', 'main_contact', 'main_contact__customer'
        )
    ).load_all_queryset(
        models.Invoice, models.Invoice.objects.all().select_related(
            'customer', 'end_customer', 'customer__original', 'end_customer__original', 'quote_number', 'invoice_number'
        )
    ).load_all_queryset(
        models.File, models.File.objects.all().select_related('file_number', 'customer').prefetch_related(
            'keywords'
        )
    ).load_all_queryset(
        models.Import, models.Import.objects.all().select_related('import_number', 'customer').prefetch_related(
            'keywords'
        )
    ).load_all_queryset(
        models.Event, models.Event.objects.all().prefetch_related('customers', 'contracts', 'accounts', 'keywords')
    )
But even then, the search still takes 5 seconds. I then used the profiler from django-debug-toolbar, which gave me this information:
From what I can tell, the issue lies in haystack/query:779::__getitem__, which is hit twice, each costing 1.5 seconds. I have glanced through the code in question, but cannot make sense of it. So where do I go from here?
You say in the question:
I then updated my search view to prefetch relations […]
The code you present, though, does not use QuerySet.prefetch_related for most of the relations. Instead, your sample code uses QuerySet.select_related for most of them; this does not pre-fetch the objects.
The documentation for each of those methods is extensive and can help you decide which is correct for your case.
In particular, the QuerySet.prefetch_related documentation says:
select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a ‘many’ relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.
prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related. It also supports prefetching of GenericRelation and GenericForeignKey, however, it must be restricted to a homogeneous set of results. For example, prefetching objects referenced by a GenericForeignKey is only supported if the query is restricted to one ContentType.
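To make the distinction in the quoted documentation concrete, here is a minimal sketch that imitates the two strategies with the standard-library sqlite3 module. It is not Django's implementation. The table names, the sample rows, and the helpers run, select_related_style and prefetch_related_style are all hypothetical and only loosely modelled on the Customer, customer_number and keywords names from the question.

```python
import sqlite3

# Hypothetical schema: each customer has exactly one customer_number
# (single-valued relation) and any number of keywords (multi-valued relation).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_number (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        name TEXT,
        customer_number_id INTEGER REFERENCES customer_number(id)
    );
    CREATE TABLE keyword (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        word TEXT
    );
    INSERT INTO customer_number VALUES (1, 'C-001'), (2, 'C-002');
    INSERT INTO customer VALUES (1, 'Acme', 1), (2, 'Globex', 2);
    INSERT INTO keyword VALUES (1, 1, 'anvils'), (2, 1, 'rockets'), (3, 2, 'energy');
""")

# Every statement issued through run() is recorded here, roughly the way a
# debug toolbar would list queries.
query_log = []


def run(sql, params=()):
    """Execute one statement, record it, and return all rows."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def select_related_style():
    """One JOINed query: the related row's columns travel in the same SELECT."""
    rows = run(
        "SELECT c.name, n.value FROM customer c "
        "JOIN customer_number n ON n.id = c.customer_number_id "
        "ORDER BY c.id"
    )
    return [{"name": name, "customer_number": value} for name, value in rows]


def prefetch_related_style():
    """Two queries: base rows, then one lookup for all related rows,
    with the 'joining' done in Python."""
    customers = run("SELECT id, name FROM customer ORDER BY id")
    ids = [customer_id for customer_id, _ in customers]
    placeholders = ",".join("?" * len(ids))
    keywords = run(
        "SELECT customer_id, word FROM keyword "
        f"WHERE customer_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    # Group the related rows by their owner, then attach them.
    by_customer = {}
    for customer_id, word in keywords:
        by_customer.setdefault(customer_id, []).append(word)
    return [
        {"name": name, "keywords": by_customer.get(customer_id, [])}
        for customer_id, name in customers
    ]


if __name__ == "__main__":
    print(select_related_style())
    print(prefetch_related_style())
    print(len(query_log), "statements issued in total")
```

In this sketch the single-valued relation costs one statement and the multi-valued relation costs two, regardless of how many customers match. That is the point of both methods: the query count stays constant instead of growing with the number of results.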