
Django Rest Framework: How do I order/sort a search/filter query?

I'm building out an API with Django Rest Framework, and I'd like to have a feature that allows users to search by a query. Currently, http://127.0.0.1:8000/api/v1/species/?name=human yields:

{
    count: 3,
    next: null,
    previous: null,
    results: [
        {
            id: 1,
            name: "Humanoid",
            characters: [
                {
                    id: 46,
                    name: "Doctor Princess"
                }
            ]
        },
        {
            id: 3,
            name: "Inhuman (overtime)",
            characters: [

            ]
        },
        {
            id: 4,
            name: "Human",
            characters: [
                {
                    id: 47,
                    name: "Abraham Lincoln"
                }
            ]
        }
    ]
}

It's pretty close to what I want, but not quite there. I'd like it so that the first object inside results would be the one with the id of 4 since the name field is the most relevant to the search query (?name=human). (I don't really care about how the rest is ordered.) It seems that currently it is sorting the results by ascending id. Anyone know a good way to handle this? Thanks!

Here is my api folder's views.py

class SpeciesFilter(django_filters.FilterSet):
    name = django_filters.CharFilter(name="name", lookup_type=("icontains"))
    class Meta:
        model = Species
        fields = ['name']

class SpeciesViewSet(viewsets.ModelViewSet):
    queryset = Species.objects.all()
    serializer_class = SpeciesSerializer
    filter_backends = (filters.DjangoFilterBackend,)
    # search_fields = ('name',)
    filter_class = SpeciesFilter
pyramidface asked Jul 16 '15 23:07



2 Answers

You want to sort the search results by relevance. In your case, name: "Human" should be the best result because it exactly matches the query word.

If you only need to solve this particular problem, you could use a raw SQL query, something like:

# NOT TESTED; the SQL may vary depending on which database you are using
queryset = Species.objects.raw("select * from species where lower(name) like '%human%' order by char_length(name) asc limit 20")

This query finds every record whose name contains "human" (ignoring case) and sorts the results by the length of the name field, shortest first, so name: "Human" will be the first item to show up.
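
To try the idea outside Django, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory table, its name and the sample rows are assumptions made up for illustration, and SQLite's length() stands in for char_length(), which other databases use.

```python
import sqlite3

# In-memory stand-in for the real table; table name and rows are
# assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO species (id, name) VALUES (?, ?)",
    [(1, "Humanoid"), (2, "Candy Person"), (3, "Inhuman (overtime)"), (4, "Human")],
)

term = "human"
# Case-insensitive substring match, shortest name first.
# The search term is passed as a parameter instead of being pasted into the SQL.
rows = conn.execute(
    "SELECT id, name FROM species "
    "WHERE lower(name) LIKE ? "
    "ORDER BY length(name) ASC LIMIT 20",
    ("%" + term.lower() + "%",),
).fetchall()
names = [name for _id, name in rows]
print(names)
conn.close()
```

Passing the term as a query parameter also avoids building SQL from user input by hand, which matters if the value comes straight from ?name=.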


FYI, a database query is usually not the best approach for this kind of thing. You should check out the django-haystack project, which helps you build a search engine on top of a Django project, fast and simple.

piglei answered Sep 21 '22 22:09



I agree with @piglei on django-haystack, but I think sorting by field value length is a terrible idea, and there is also no need to resort to writing SQL for that. A better way would be something like:

Species.objects.all().extra(select={'relevance': 'char_length(name)'}, order_by=['relevance'])  # PostgreSQL

Still terrible, even as a quick fix. If you really don't want to set up django-haystack, a slightly less terrible approach would be to sort your results using Python:

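from difflib import SequenceMatcher

needle = 'human'  # the search term, e.g. the value of ?name=
species = Species.objects.all()

# Most similar names first; note that sorted() returns a list, not a queryset
species = sorted(species,
                 key=lambda s: SequenceMatcher(None, needle.lower(), s.name.lower()).quick_ratio(),
                 reverse=True)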

I didn't test this code, so let me know if it doesn't work and also if you need help integrating it in DRF.
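
For anyone who wants to try the ranking without a database, here is a rough, self-contained sketch. The helper name rank_by_similarity and its get_text parameter are made up for this example, and the names are taken from the question's sample output.

```python
from difflib import SequenceMatcher


def rank_by_similarity(items, needle, get_text=lambda item: item):
    """Return items sorted so the text most similar to needle comes first.

    get_text pulls the string to compare out of each item, for example
    lambda s: s.name for model instances (hypothetical usage).
    """
    needle = needle.lower()

    def score(item):
        # quick_ratio() is a cheap upper bound on ratio(); 1.0 means identical.
        return SequenceMatcher(None, needle, get_text(item).lower()).quick_ratio()

    return sorted(items, key=score, reverse=True)


names = ["Humanoid", "Inhuman (overtime)", "Human"]
ranked = rank_by_similarity(names, "human")
print(ranked)
```

In a DRF viewset this could probably be applied to the already filtered queryset, for instance in an overridden filter_queryset(), but the result is a plain list rather than a queryset, so pagination and any later queryset calls would need checking.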

The reason why this is still terrible is that difflib's matching algorithm differs from the one the database uses to filter, so rows that difflib would rank as more relevant than some of the __icontains matches may never be returned in the first place. More on that here: Is there a way to filter a django queryset based on string similarity (a la python difflib)?

Edit:

While trying to come up with an example of why sorting by field value length is a terrible idea, I've actually managed to convince myself that it may be the less terrible idea when used with __icontains. I'm gonna leave the answer like this though as it might be useful or interesting to someone. Example:

needle = 'apple'
haystack = ['apple', 'apples', 'apple computers', 'apples are nice']  # Sorted by value length
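
A quick sketch of why length works here, in plain Python: every match already contains the needle, so any extra length is just extra unrelated characters. The function name and the shuffled input list (including the non-matching 'banana') are made up for this example.

```python
def icontains_shortest_first(haystack, needle):
    """Keep strings containing needle (case-insensitive), shortest first."""
    needle = needle.lower()
    matches = [s for s in haystack if needle in s.lower()]
    # sorted() is stable, so equally long matches keep their input order.
    return sorted(matches, key=len)


# Shuffled input with one entry that does not match at all.
haystack = ['apple computers', 'banana', 'apples are nice', 'apples', 'apple']
ranked = icontains_shortest_first(haystack, 'apple')
print(ranked)
```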
demux answered Sep 21 '22 22:09
