Elasticsearch - previous/next functionality

I created a search engine to search all documents in my Elasticsearch index. When a user clicks a document on the search results page, they leave that page and open the detail page for that document.

Now I'd like to implement some simple document navigation on that detail page, but I can't figure out how to do this with Elasticsearch. I'd like to have a previous-document link and a next-document link at the top of the detail page.

My idea was to save all returned documents in a session cookie or something similar, to remember the next and previous documents for the current search. But the search results page is also paginated. When a user selects the last document on a results page, the next link wouldn't work because the current search has no more documents.

Is this a common problem or too specific? Does anybody have an idea that could help me solve it? Maybe the scroll API?

Thanks

Stillmatic1985 asked Jan 20 '15 16:01


1 Answer

The following works beautifully for me. Ensure that you are using a regularly formatted list of sort definitions like this:

function getSortDefinitions() {
    return [
        'newest' => [
            [ 'created_at' => 'desc' ],
            [ 'id' => 'desc' ],
        ],
        'oldest' => [
            [ 'created_at' => 'asc' ],
            [ 'id' => 'asc' ],
        ],
        'highest' => [
            [ 'price' => 'desc' ],
            [ 'created_at' => 'desc' ],
            [ 'id' => 'desc' ],
        ],
        'lowest' => [
            [ 'price' => 'asc' ],
            [ 'created_at' => 'asc' ],
            [ 'id' => 'asc' ],
        ],
    ];
}

An aside: Adding id makes the resultset have predictable ordering for records with the same timestamp. This happens often with testing fixtures where the records are all saved at the same time.
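To see why the tiebreaker matters, here is a minimal in-memory sketch that applies one sort definition to plain PHP arrays. It only imitates the ordering; it does not talk to Elasticsearch. The helper name compareBySort and the sample records are assumptions made for illustration (PHP 7+ for the <=> operator).

```php
<?php
// Hypothetical helper: turns one sort definition (a list of
// [field => direction] clauses) into a usort() comparator.
function compareBySort(array $sort)
{
    return function (array $a, array $b) use ($sort) {
        foreach ($sort as $clause) {
            $field = array_keys($clause)[0];
            $cmp = $a[$field] <=> $b[$field];
            if ($cmp !== 0) {
                // Flip the comparison for descending clauses.
                return $clause[$field] === 'asc' ? $cmp : -$cmp;
            }
        }
        return 0;
    };
}

// Three fixture records saved at the same moment.
$records = [
    ['id' => 2, 'created_at' => '2015-01-20 16:01:00'],
    ['id' => 3, 'created_at' => '2015-01-20 16:01:00'],
    ['id' => 1, 'created_at' => '2015-01-20 16:01:00'],
];

// created_at ties for every record, so the id clause decides the order.
usort($records, compareBySort([['created_at' => 'desc'], ['id' => 'desc']]));
$ids = array_column($records, 'id');
```

Without the id clause the comparator would return 0 for every pair, and the resulting order would depend on the sorting implementation rather than on the data.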

Now whenever someone searches, they have usually selected a few filters, perhaps a query and definitely a sort order. Create a table that stores this so you can generate a search context to work with:

create table search_contexts (
    id int auto_increment primary key,
    hash varchar(255) not null,
    query varchar(255) not null,
    filters json not null,
    sort varchar(255) not null,

    unique search_contexts_hash_uk (hash)
);

Use something like the following in your language of choice to insert and get a reference to the search context:

function saveSearchContext($query, $filters, $sort)
{
    // Assuming some magic re: JSON encoding of $filters
    $hash = md5(json_encode(compact('query', 'filters', 'sort')));
    return SearchContext::firstOrCreate(compact('hash', 'query', 'filters', 'sort'));
}

Notice that we only insert a search context if there isn't one already there with the same parameters, so we end up with one unique row per distinct search. If you would rather save one row per search performed, and can live with the much larger volume, use uniqid instead of md5 and just create the record.
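One detail the hashing step depends on: json_encode preserves array order, so the same filters supplied in a different order would produce a different hash and a second row. A minimal sketch of one way around that, assuming the filters are a plain (possibly nested) PHP array; normalizeForHash and searchContextHash are hypothetical names, not part of Laravel or of the code above.

```php
<?php
// Hypothetical helper: recursively sorts associative arrays by key so
// that logically equal search contexts serialize identically.
function normalizeForHash($value)
{
    if (!is_array($value)) {
        return $value;
    }

    $value = array_map('normalizeForHash', $value);

    // Only reorder associative arrays; list-style arrays keep their
    // order because the order of a list may be meaningful.
    if (array_keys($value) !== range(0, count($value) - 1)) {
        ksort($value);
    }

    return $value;
}

// Hypothetical helper: stable md5 hash over the three context parts.
function searchContextHash($query, array $filters, $sort)
{
    $payload = normalizeForHash(compact('query', 'filters', 'sort'));

    return md5(json_encode($payload));
}
```

With this in place, searchContextHash('lamp', ['color' => 'red', 'size' => 'L'], 'newest') and the same call with the two filters swapped return the same 32-character hash.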

On the results index page, whenever you generate a link to the detail page, use the hash as a query parameter like this:

http://example.com/details/2456?search=7ddf32e17a6ac5ce04a8ecbf782ca509
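A small sketch of building that link in plain PHP. The function name detailUrl is hypothetical and the base URL is the example one from above; in a Laravel app you would more likely use its own route or URL helpers.

```php
<?php
// Hypothetical helper: builds the detail-page link carrying the
// search context hash as the "search" query parameter.
function detailUrl($documentId, $searchHash)
{
    return 'http://example.com/details/' . rawurlencode((string) $documentId)
        . '?' . http_build_query(['search' => $searchHash]);
}

$url = detailUrl(2456, '7ddf32e17a6ac5ce04a8ecbf782ca509');
```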

In your detail page code, do something like this:

function getAdjacentDocument($search, $documentId, $next = true) {
    $sortDefinitions = getSortDefinitions();

    if (!$next) {
        // Reverse the sort definitions by looping through $sortDefinitions
        // and swapping asc and desc around
        // Note: array_map() takes the callback first, then the array.
        $sortDefinitions = array_map(function ($defn) {
            return array_map(function ($array) {
                $field = head(array_keys($array));
                $direction = $array[$field];

                $direction = $direction == 'asc' ? 'desc' : 'asc';

                return [ $field => $direction ];
            }, $defn);
        }, $sortDefinitions);
    }

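    // Start from the filters stored in the search context
    // (assumed to be decoded from JSON into an array).
    $filters = $search->filters;

    // Add a must_not filter which will ensure that the
    // current page's document ID is *not* in the results.
    $filters['blacklist'] = $documentId;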

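    $params = [
        // Same index as the get call for the detail record
        'index' => 'documents',
        'body' => [
            'query' => generateQuery($search->query, $filters),
            // The sort name comes from the stored search context
            'sort' => $sortDefinitions[$search->sort],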

            // We are only interested in 1 document adjacent
            // to this one, limit results
            'size' => 1
        ]
    ];

    $response = Elasticsearch::search($params);

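    // A search response has no 'found' key (that belongs to get),
    // so check whether any hit came back instead.
    if (!empty($response['hits']['hits'])) {
        return $response['hits']['hits'][0];
    }

    // No adjacent document in this direction
    return null;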
}

function getNextDocument($search, $documentId) {
    return getAdjacentDocument($search, $documentId, true);
}

function getPreviousDocument($search, $documentId) {
    return getAdjacentDocument($search, $documentId, false);
}

// Retrieve the search context given its hash as query parameter
$searchContext = SearchContext::whereHash(Input::query('search'))->first();

// From the route segment
$documentId = Input::route('id');

$currentDocument = Elasticsearch::get([
    'id' => $documentId,
    'index' => 'documents'
]);

$previousDocument = getPreviousDocument($searchContext, $documentId);
$nextDocument = getNextDocument($searchContext, $documentId);

The key to this technique is that you are generating two searches in addition to the get for the detail record.

One search goes forwards from that record and the other goes backwards from it. Both use the same search context, so they stay in line with each other.

In both cases, you take the first record that is not the current record, and it should be correct.
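One caveat: excluding the current ID alone does not make a search start at the current record. With size 1 it returns the first hit of the whole result set in that sort order. One way to anchor each search at the current record is Elasticsearch's search_after parameter (available from Elasticsearch 5.0), which takes the current hit's sort values. The following is a hedged sketch, not the code above: adjacentSearchBody is a hypothetical helper, and $currentSortValues is assumed to be the "sort" array that Elasticsearch returned for the current hit under the same sort definition, for example carried over from the results page.

```php
<?php
// Hypothetical helper: builds the request body for one adjacent-document
// search. $sort is one entry from getSortDefinitions().
function adjacentSearchBody(array $query, array $sort, array $currentSortValues, $next = true)
{
    if (!$next) {
        // Going backwards: flip every direction, keep the field order.
        $sort = array_map(function (array $clause) {
            $field = array_keys($clause)[0];

            return [$field => $clause[$field] === 'asc' ? 'desc' : 'asc'];
        }, $sort);
    }

    return [
        'query' => $query,
        'sort' => $sort,
        // Resume immediately after the current record's sort values;
        // the values must be in the same order as the sort clauses.
        'search_after' => $currentSortValues,
        'size' => 1,
    ];
}
```

Because search_after already skips the current record, the must_not filter on its ID becomes optional in this variant. It relies on the sort being unique per document, which is what the id tiebreaker in the sort definitions provides.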

datashaman answered Oct 05 '22 00:10