Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

elasticsearch - Return the tokens of a field

Tags:

How can I have the tokens of a particular field returned in the result

For example, A GET request

curl -XGET 'http://localhost:9200/twitter/tweet/1' 

returns

{     "_index" : "twitter",     "_type" : "tweet",     "_id" : "1",      "_source" : {         "user" : "kimchy",         "postDate" : "2009-11-15T14:12:12",         "message" : "trying out Elastic Search"     }  } 

I would like to have the tokens of '_source.message' field included in the result

like image 202
Kennedy Avatar asked Nov 01 '12 13:11

Kennedy


1 Answers

There is also another way to do it using the following script_fields script:

curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/test-idx/_search?pretty=true' -d '{     "query" : {         "match_all" : { }     },     "script_fields": {         "terms" : {             "script": "doc[field].values",             "params": {                 "field": "message"             }         }      } }' 

It's important to note that while this script returns the actual terms that were indexed, it also caches all field values and on large indices can use a lot of memory. So, on large indices, it might be more useful to retrieve field values from stored fields or source and reparse them again on the fly using the following MVEL script:

import org.apache.lucene.analysis.tokenattributes.CharTermAttribute; import java.io.StringReader;  // Cache analyzer for further use cachedAnalyzer=(isdef cachedAnalyzer)?cachedAnalyzer:doc.mapperService().documentMapper(doc._type.value).mappers().indexAnalyzer();  terms=[]; // Get value from Fields Lookup //val=_fields[field].values;  // Get value from Source Lookup val=_source[field];  if(val != null) {   tokenStream=cachedAnalyzer.tokenStream(field, new StringReader(val));    CharTermAttribute termAttribute = tokenStream.addAttribute(CharTermAttribute);    while(tokenStream.incrementToken()) {      terms.add(termAttribute.toString())   };    tokenStream.close();  }  terms 

This MVEL script can be stored as config/scripts/analyze.mvel and used with the following query:

curl 'http://localhost:9200/test-idx/_search?pretty=true' -d '{     "query" : {         "match_all" : { }     },     "script_fields": {         "terms" : {             "script": "analyze",             "params": {                 "field": "message"             }         }          } }' 
like image 183
imotov Avatar answered Sep 21 '22 16:09

imotov