
Which fields of the model in the django-haystack tutorial get indexed?

I'm trying to get my head around the django-haystack tutorial in order to add search functionality to my application. Unfortunately, I don't quite understand some key parts when it comes to building the search index.

In the tutorial, the following Django model serves as an example:

from django.contrib.auth.models import User
from django.db import models

class Note(models.Model):
    user = models.ForeignKey(User)
    pub_date = models.DateTimeField()
    title = models.CharField(max_length=200)
    body = models.TextField()

The respective index class for the Note model is this:

from haystack import indexes

class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    author = indexes.CharField(model_attr='user')
    pub_date = indexes.DateTimeField(model_attr='pub_date')

    def get_model(self):
        return Note

Last but not least, I'm asked to create a data template which looks like this:

{{ object.title }}
{{ object.user.get_full_name }}
{{ object.body }}

After reading the whole tutorial, I'm still confused about what actually gets indexed. As far as I understand, the contents of the fields author and pub_date are used to create the index. The field text simply provides some settings. And the data template specifies how the search results will be displayed later on, i.e., which fields of the model are shown in the search results.

Is this correct or am I completely wrong? The tutorial and the documentation are quite vague in a lot of aspects in my opinion. Thank you very much in advance.

pemistahl asked Dec 19 '12 19:12


2 Answers

You're right, the tutorial is a little vague, but here's how I understand it. For each instance of the Note model, Haystack renders the data template with that instance as object and indexes the rendered output. The rendered template is the "document" for that instance. The tutorial says, "This allows us to use a data template (rather than error prone concatenation) to build the document the search engine will use in searching." So if you only wanted the title field to be searchable, you would include only {{ object.title }} in the data template.
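To make the idea of a rendered "document" concrete, here is a minimal plain-Python sketch of what rendering the tutorial's three-line data template amounts to for one note. It does not use Haystack or Django's template engine; FakeUser, FakeNote, render_document and the sample values are hypothetical stand-ins invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeUser:
    """Hypothetical stand-in for Django's User model."""
    first_name: str
    last_name: str

    def get_full_name(self):
        return f"{self.first_name} {self.last_name}"


@dataclass
class FakeNote:
    """Hypothetical stand-in for the tutorial's Note model."""
    user: FakeUser
    title: str
    body: str


def render_document(note):
    # Mirrors the three lines of the data template:
    # {{ object.title }}, {{ object.user.get_full_name }}, {{ object.body }}
    return "\n".join([note.title, note.user.get_full_name(), note.body])


note = FakeNote(FakeUser("Ada", "Lovelace"), "Shopping list", "Buy milk and eggs")
document = render_document(note)
print(document)
```

In this sketch a search for "milk" can match the note only because the body is part of the rendered text. Removing the body line from render_document would take it out of the searchable document.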

The other fields in the NoteIndex class are used for filtering and ordering search results. If your index class looked like this:

class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)

you would not be able to issue a search query that says, "Give me all the Notes published in the last year where foo appears in the document text." If you include pub_date as a field in your NoteIndex (as they do in the tutorial) then you can make a query such as the following:

from haystack.query import SearchQuerySet
recent_results = SearchQuerySet().filter(content='foo').order_by('-pub_date')[:5]

which asks for the 5 most recently published documents that contain the word foo. Without pub_date in the NoteIndex class, you could presumably query for content='foo' and then filter or sort the results yourself, but I'd expect the query to be much more efficient if you tell Haystack at indexing time about the fields you might want to filter on.
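A query that actually restricts results to the last year, as in the example request above, might look like the sketch below. It assumes a configured Django project with Haystack installed and the tutorial's NoteIndex; the helper names one_year_before and recent_notes_matching are made up for this example, and "one year" is approximated as 365 days.

```python
import datetime


def one_year_before(moment):
    """Return the datetime 365 days before `moment` (an approximation of one year)."""
    return moment - datetime.timedelta(days=365)


def recent_notes_matching(term, now=None):
    """Notes whose document contains `term` and whose pub_date is within the last year."""
    # Imported lazily because Haystack needs configured Django settings to work.
    from haystack.query import SearchQuerySet

    now = now or datetime.datetime.now()
    return (
        SearchQuerySet()
        .filter(content=term, pub_date__gte=one_year_before(now))
        .order_by('-pub_date')
    )
```

The pub_date__gte lookup only works because pub_date is declared as a field on the index; with a text-only index there would be nothing to filter or order on.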

As for how the search results are displayed, you use a different template for that. In the most basic Haystack usage, which the tutorial shows, the template for displaying search results goes in search/search.html (see http://django-haystack.readthedocs.org/en/latest/tutorial.html#search-template). There you can iterate through the search results and print whichever fields of the model instance (result.object) you'd like.
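The same kind of loop can be written in Python, for instance in a view or a shell session. The sketch below only assumes that each result exposes the model instance as result.object, as mentioned above; describe_results and the SimpleNamespace stand-ins are hypothetical and exist only so the snippet runs without a search backend.

```python
import datetime
from types import SimpleNamespace


def describe_results(results):
    """Build one display line per search result from the underlying model instance."""
    lines = []
    for result in results:
        note = result.object  # the Note instance behind this search hit
        lines.append(f"{note.title} ({note.pub_date:%Y-%m-%d})")
    return lines


# Stand-in for real search results: objects with an `.object` attribute.
fake_results = [
    SimpleNamespace(object=SimpleNamespace(
        title="Shopping list",
        pub_date=datetime.datetime(2012, 12, 19, 19, 12),
    )),
]
print(describe_results(fake_results))
```

Which fields get displayed here is independent of which fields went into the data template: display reads from the model instance, while the data template only decides what is searchable.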

Emily answered Oct 18 '22 19:10


In your class definition,

class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    author = indexes.CharField(model_attr='user')
    pub_date = indexes.DateTimeField(model_attr='pub_date')

Haystack indexes the model's user attribute under the name author, and the model's pub_date field under the name pub_date.

The template includes only the "searchable" fields. For example, if you want to store some sensitive data in the search index, you can keep it out of the free-text search by not including it in the template.

The text field can be thought of as the free-text search field.
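As a rough illustration of that distinction, the toy in-memory "index" below keeps a free-text text entry alongside separately stored fields. It is plain Python, not how Haystack or any search backend is implemented; index_entry, free_text_search, filter_by_author and the sample data are hypothetical.

```python
def index_entry(title, body, author, pub_date):
    """Build one toy index entry: `text` is the searchable document, the rest are stored fields."""
    return {
        "text": f"{title}\n{body}",  # only what a data template would render
        "author": author,            # stored, but not part of the free text
        "pub_date": pub_date,
    }


def free_text_search(entries, term):
    """Return entries whose document text contains `term` (case-insensitive)."""
    return [e for e in entries if term.lower() in e["text"].lower()]


def filter_by_author(entries, author):
    """Return entries whose stored author field equals `author`."""
    return [e for e in entries if e["author"] == author]


entries = [
    index_entry("Shopping list", "Buy milk and eggs", "ada", "2012-12-19"),
    index_entry("Reading list", "Finish the Haystack tutorial", "grace", "2012-12-20"),
]
```

Here a free-text search for "ada" finds nothing because the author was left out of the text entry, yet filter_by_author(entries, "ada") still returns the first note. Note that the tutorial's own data template does include the user's full name, so there the author would also be searchable as free text.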

karthikr answered Oct 18 '22 19:10