Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django haystack whoosh super slow

I have a simple setup with django-haystack and whoosh engine. A search yielding 19 objects took me 8 seconds. I used the django-debug-toolbar to determine that i had a bunch of repeated queries.

I then updated my search view to prefetch relations, so that duplicate queries would not happen:

class MySearchView(SearchView):
    template_name = 'search_results.html'
    form_class = SearchForm
    queryset = RelatedSearchQuerySet().load_all().load_all_queryset(
        models.Customer, models.Customer.objects.all().select_related('customer_number').prefetch_related(
            'keywords'
        )
    ).load_all_queryset(
        models.Contact, models.Contact.objects.all().select_related('customer')
    ).load_all_queryset(
        models.Account, models.Account.objects.all().select_related(
            'customer', 'account_number', 'main_contact', 'main_contact__customer'
        )
    ).load_all_queryset(
        models.Invoice, models.Invoice.objects.all().select_related(
            'customer', 'end_customer', 'customer__original', 'end_customer__original', 'quote_number', 'invoice_number'
        )
    ).load_all_queryset(
        models.File, models.File.objects.all().select_related('file_number', 'customer').prefetch_related(
            'keywords'
        )
    ).load_all_queryset(
        models.Import, models.Import.objects.all().select_related('import_number', 'customer').prefetch_related(
            'keywords'
        )
    ).load_all_queryset(
        models.Event, models.Event.objects.all().prefetch_related('customers', 'contracts', 'accounts', 'keywords')
    )

But even then, the search still takes 5 seconds. I then used the profiler from django-debug-toolbar, which gave me this information:

Django debug toolbar profiler results

From what I can tell, the issue lies in haystack/query:779::__getitem__, which is hit twice, each costing 1.5 seconds. I have glanced through the code in question, but cannot make sense of it. So where do I go from here?

like image 823
Eldamir Avatar asked Aug 24 '15 13:08

Eldamir


Video Answer


1 Answers

You say in the question:

I then updated my search view to prefetch relations […]

The code you present, though, does not use QuerySet.prefetch_related for most of them. Instead, your sample code uses QuerySet.select_related for most of them; this does not pre-fetch the objects.

The documentation for each of those methods is extensive and can help to decide which is correct for your case.

In particular, the QuerySet.prefetch_related documentation says:

select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a ‘many’ relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.

prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related. It also supports prefetching of GenericRelation and GenericForeignKey, however, it must be restricted to a homogeneous set of results. For example, prefetching objects referenced by a GenericForeignKey is only supported if the query is restricted to one ContentType.

like image 62
bignose Avatar answered Sep 22 '22 17:09

bignose