
Elasticsearch get matching documents after specific document id

When I search for documents, I take the first 10 and pass them to the view. When the user scrolls to the end of the list, the next 10 elements should be displayed.

I know the id of the last displayed document, and now I have to get the next 10. I could run the exact same search with an offset of 10, but it would be much better to run the same query, pass it the id of the last retrieved document, and get back the matching documents that come after the document with that id.

Is that possible with Elasticsearch?

=== UPDATE

I want to describe my issue in more detail, because it seems the description above is not clear enough. Sorry for that.

The case:

You have a kind of feed that grows every second. When a user opens the feed, he gets the 10 most recent entries; when he scrolls down, he wants to get the next 10 entries.

Because the feed grows every second, the usual offset / limit (from / size in Elasticsearch) can't solve this problem: depending on how many entries arrive between the first request (first 10 entries) and the request for the next entries, you would show already displayed entries again or entries that are entirely new.
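
To make the drift concrete, here is a small simulation in plain Python with no Elasticsearch involved. The feed is just a list of made-up entry ids sorted newest first, and `page` is a hypothetical helper that mimics from / size:

```python
def page(feed, offset, limit):
    """Return one page of a newest-first feed using offset/limit."""
    return feed[offset:offset + limit]

# Feed of made-up entry ids, newest first (higher id = newer).
feed = list(range(20, 0, -1))           # [20, 19, ..., 1]
first_page = page(feed, 0, 10)          # ids 20 down to 11

# Three new entries arrive before the user scrolls down.
feed = [23, 22, 21] + feed
second_page = page(feed, 10, 10)        # ids 13 down to 4

# Entries that are shown twice because the offset shifted by three.
duplicates = sorted(set(first_page) & set(second_page), reverse=True)
print(duplicates)  # [13, 12, 11]
```

The three entries that arrived in between push everything down by three positions, so the second request returns three entries the user has already seen.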

The request for the next 10 elements AFTER the already displayed entries gives the backend the id of the last entry that was displayed, so the backend knows to ignore all entries up to and including that one.

At the moment I'm handling this in code: I request the list of all matching entries from Elasticsearch and iterate over it. This way I can do everything I want (no surprise) and extract the needed chunk of entries.
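
Roughly, the code-side approach looks like the sketch below. The function name `next_chunk` and the entry shape (a dict with an "id" key) are assumptions for illustration, not an actual Elasticsearch response format:

```python
def next_chunk(entries, last_seen_id, size=10):
    """Return up to `size` entries that follow the entry with `last_seen_id`.

    `entries` is the full, already sorted list of matches; each entry is a
    dict with an "id" key (a hypothetical shape used only for this sketch).
    """
    for index, entry in enumerate(entries):
        if entry["id"] == last_seen_id:
            return entries[index + 1:index + 1 + size]
    return []  # id not found: nothing to continue from


# Usage with made-up data: 25 entries, the user has seen up to "doc9".
entries = [{"id": f"doc{i}"} for i in range(25)]
chunk = next_chunk(entries, "doc9")
```

This always fetches and scans the full result list, which is why it gets slower as the feed grows.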

My question is: is there a built-in solution for this in Elasticsearch? Solving the problem my way is not the fastest.

maddin2code asked Nov 08 '13 05:11



2 Answers

You just have to build your query DSL and paginate with from and size:

{ "size": 10, "from" : YOUR_OFFSET }
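
As a minimal sketch, the request body for a given page could be built like this. `page_body` is a hypothetical helper, and the page number is assumed to be zero-based:

```python
def page_body(query, page, size=10):
    """Build a search body for zero-based page number `page`."""
    return {"query": query, "from": page * size, "size": size}


# Third page (entries 20 to 29) of a match_all query.
body = page_body({"match_all": {}}, 2)
```

Note that this is plain offset paging, so it is still affected by the shifting described in the question when new entries arrive between requests.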

remiheens answered Oct 26 '22 07:10


It's an old topic, but it seems that the Search After API, available since Elasticsearch 5.0, does exactly what is needed. Provide the timestamp and id of your last document as search_after, in the same order as the sort clause, for example:

GET twitter/tweet/_search
{
  "size": 10,
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  },
  "search_after": [
    1463538857,
    "tweet#654323"
  ],
  "sort": [
    {
      "date": "asc"
    },
    {
      "_uid": "desc"
    }
  ]
}

Source: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-search-after.html
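
When a request has a sort clause, each hit in the response carries its sort values in a "sort" array, so the next request can be derived from the last hit. A minimal sketch, where `next_page_body` is a hypothetical helper and the hit below is made-up data matching the example above:

```python
import copy


def next_page_body(body, last_hit):
    """Return a copy of `body` that continues after `last_hit`.

    `last_hit` is the last entry of response["hits"]["hits"]; its "sort"
    array holds the values of the fields named in the sort clause.
    """
    next_body = copy.deepcopy(body)
    next_body["search_after"] = last_hit["sort"]
    next_body.pop("from", None)  # drop any offset; search_after takes its place
    return next_body


body = {
    "size": 10,
    "query": {"match": {"title": "elasticsearch"}},
    "sort": [{"date": "asc"}, {"_uid": "desc"}],
}
# Made-up last hit of the previous page.
last_hit = {"_id": "654323", "sort": [1463538857, "tweet#654323"]}
next_body = next_page_body(body, last_hit)
```

Because the cursor is the last document's sort values rather than a position, entries added to the feed in the meantime don't shift the next page.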

MastaP answered Oct 26 '22 07:10