
Exclude objects on Haystack search without the need of update_index

I need Haystack search to exclude objects with published=False. So far I have managed it by adding exclude(published=False) to the index's queryset, as follows:

class MymodelIndex(indexes.RealTimeSearchIndex, indexes.Indexable):
    def get_queryset(self):
        return Mymodel.objects.all().exclude(published=False)

It works as expected. The problem is that I need to run ./manage.py rebuild_index every time a new object is added to the database, which is impractical.

How can I do this without having to run anything else?

Notes:

Indexes on Haystack work for many models, so something like this:

search = (
    SearchQuerySet().filter(content=term)
)

returns many kinds of objects and not just one model.

Thanks

PepperoniPizza asked Mar 12 '13 22:03


2 Answers

Since Haystack 2.4.0 you can raise haystack.exceptions.SkipDocument to skip individual records which are not easily excluded using index_queryset.

https://github.com/django-haystack/django-haystack/releases/tag/v2.4.0
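
A minimal sketch of that mechanism, written in plain Python so it runs without Haystack installed. The SkipDocument class here is only a local stand-in for haystack.exceptions.SkipDocument, and prepare_record and build_documents are hypothetical names. In a real project the check would go in the SearchIndex's prepare method and Haystack's backend would do the catching.

```python
class SkipDocument(Exception):
    """Local stand-in for haystack.exceptions.SkipDocument (assumption)."""


def prepare_record(record):
    # Hypothetical analogue of a SearchIndex prepare step:
    # refuse to build a document for an unpublished record.
    if not record["published"]:
        raise SkipDocument("record %s is not published" % record["pk"])
    return {"id": record["pk"], "text": record["title"]}


def build_documents(records):
    # Mimics the indexing loop: a skipped record is left out of the index.
    documents = []
    for record in records:
        try:
            documents.append(prepare_record(record))
        except SkipDocument:
            continue
    return documents


records = [
    {"pk": 1, "title": "Visible", "published": True},
    {"pk": 2, "title": "Draft", "published": False},
]
indexed = build_documents(records)
print(indexed)
```

This only controls what goes into the index when a record is indexed. The index still has to be updated when an object is added or changed.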

Aaron McMillin answered Oct 04 '22 04:10


I recently had to do something similar to this and it was a pain in the arse. I could not find any other way to do this.

First off, to address the issue that Haystack indexes many models, so a filter returns matches from all of them:

Haystack handles model filtering behind the scenes using a property it indexes called django_ct, which equals the app name and model name. In my particular case it looked something like django_ct='books.Title'.

You could try filtering by doing

SearchQuerySet().filter(content=term, django_ct='app.Model')

But I don't know if it will work that way. In my particular case I had to do a raw search anyway, so I was able to add the filtering directly to that:

sqs = SearchQuerySet()
sqs = sqs.raw_search(u'(title:(%s)^500 OR author:"%s"^400 OR "%s"~2 OR (%s)) AND (django_ct:books.Title)' % (term, term, term, term))
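
The format string above has four %s placeholders, so it needs four values, or named placeholders. A small sketch of the named variant follows. build_raw_query is a hypothetical helper, and the field names and boosts are just the ones from the query above. It does not escape special query characters in the term, which real user input would probably need.

```python
def build_raw_query(term, model_ct):
    # Hypothetical helper: fill every placeholder of the raw query by name,
    # so the term is written once instead of being repeated in a tuple.
    template = (
        '(title:(%(term)s)^500 OR author:"%(term)s"^400 '
        'OR "%(term)s"~2 OR (%(term)s)) AND (django_ct:%(ct)s)'
    )
    return template % {"term": term, "ct": model_ct}


query = build_raw_query("dune", "books.Title")
print(query)
```

The resulting string would then be passed to sqs.raw_search(query).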

Regardless of how you get it, once you have the SearchQuerySet and want to filter it further without updating the index, you have to do it with your own code.

# each item in a queryset has a pk property to the model instance it references
pks = [item.pk for item in list(sqs)] # have to wrap sqs in a list otherwise it causes problems

# use those pks to create a standard django queryset object
results = Model.objects.filter(pk__in=pks)

# Now you can do any additional filtering like normal
results = results.exclude(published=False)

Of course you can combine the last two queries, I just split them out to be explicit.
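
One thing the pk__in lookup does not do is keep the search engine's relevance order, since the database returns rows in its own order. Below is a minimal sketch of restoring it, in plain Python with stand-in data. Row is a hypothetical substitute for a model instance, and rows_by_pk plays the role of what Model.objects.in_bulk(pks) would return. The sketch assumes integer primary keys and assumes the search results may hand the pks back as strings, so it normalizes them with int().

```python
from collections import namedtuple

# Hypothetical stand-in for a Django model instance.
Row = namedtuple("Row", ["pk", "title", "published"])


def order_and_filter(search_pks, rows_by_pk):
    # Walk the pks in search order, dropping missing or unpublished rows.
    ordered = []
    for raw_pk in search_pks:
        row = rows_by_pk.get(int(raw_pk))  # assumes integer primary keys
        if row is not None and row.published:
            ordered.append(row)
    return ordered


# pks in relevance order, as they might come from [item.pk for item in sqs]
search_pks = ["3", "1", "2"]
# keyed by pk, as Model.objects.in_bulk(pks) would return
rows_by_pk = {
    1: Row(1, "First", True),
    2: Row(2, "Draft", False),
    3: Row(3, "Third", True),
}
results = order_and_filter(search_pks, rows_by_pk)
print([row.pk for row in results])
```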

It's not that much code, but it took me a long while to get it working for various reasons. Hopefully it helps you out.

Eric Ressler answered Oct 04 '22 03:10