Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Retrieve analyzed tokens from ElasticSearch documents

Trying to access the analyzed/tokenized text in my ElasticSearch documents.

I know you can use the Analyze API to analyze arbitrary text according your analysis modules. So I could copy and paste data from my documents into the Analyze API to see how it was tokenized.

This seems unnecessarily time consuming, though. Is there any way to instruct ElasticSearch to returned the tokenized text in search results? I've looked through the docs and haven't found anything.

like image 384
Clay Wardell Avatar asked Nov 15 '12 19:11

Clay Wardell


People also ask

How do I retrieve data from Elasticsearch?

You can use the search API to search and aggregate data stored in Elasticsearch data streams or indices. The API's query request body parameter accepts queries written in Query DSL. The following request searches my-index-000001 using a match query. This query matches documents with a user.id value of kimchy .

What should you use to fetch a document in Elasticsearch?

Descriptionedit. You use GET to retrieve a document and its source or stored fields from a particular index. Use HEAD to verify that a document exists. You can use the _source resource retrieve just the document source or verify that it exists.

How do I view an Elasticsearch document?

You can view the document in two ways. The Table view displays the document fields row-by-row. The JSON (JavaScript Object Notation) view allows you to look at how Elasticsearch returns the document. The link is valid for the time the document is available in Elasticsearch.


1 Answers

This question is a litte old, but maybe I think an additional answer is necessary.

With ElasticSearch 1.0.0 the Term Vector API was added which gives you direct access to the tokens ElasticSearch stores under the hood on per document basis. The API docs are not very clear on this (only mentioned in the example), but in order to use the API you have to first indicate in your mapping definition that you want to store term vectors with the term_vector property on each field.

like image 109
Torsten Engelbrecht Avatar answered Sep 22 '22 02:09

Torsten Engelbrecht