elasticsearch: extract number from a field

I'm using Elasticsearch and Kibana for storing my logs. What I want is to extract a number from a field and store it in a new field.

So for instance, having this:

accountExist execution time: 1046 ms

I would like to extract the number (1046) and see it in a new field in Kibana.

Is it possible? How? Thanks for the help.

asked Oct 07 '15 by Pirulino




2 Answers

You'll need to do this before/during indexing.

Within Elasticsearch, you can get what you need during indexing:

  1. Define a new analyzer using the Pattern Analyzer to wrap a regular expression (for your purposes, one that captures the consecutive digits in the string); see the sketch after this list.
  2. Create your new field in the mapping to hold the extracted times.
  3. Use copy_to to copy the log message from the input field to the new field from (2), where the new analyzer will parse it.
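
As a rough illustration of steps 1-3, here is a minimal sketch in current Elasticsearch request syntax. The index name (app-logs), field names (log_message, execution_time), and analyzer name (digits_only) are made up for the example. The pattern analyzer treats matches of the pattern as token separators, so with [^0-9]+ the surviving tokens are the digit runs; note that an analyzer can only be attached to a text field, so the copy_to target is mapped as text here:

PUT /app-logs
{
  "settings": {
    "analysis": {
      "analyzer": {
        "digits_only": {
          "type": "pattern",
          "pattern": "[^0-9]+"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "log_message": {
        "type": "text",
        "copy_to": "execution_time"
      },
      "execution_time": {
        "type": "text",
        "analyzer": "digits_only"
      }
    }
  }
}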

The Analyze API can be helpful for testing purposes.
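
For example, with the hypothetical index and analyzer above, you could verify the extraction like this:

POST /app-logs/_analyze
{
  "analyzer": "digits_only",
  "text": "accountExist execution time: 1046 ms"
}

which should come back with a single token, 1046.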

answered Nov 15 '22 by Peter Dixon-Moses


While not performant, if you must avoid reindexing, you could use scripted fields in Kibana.

Introduction here: https://www.elastic.co/blog/using-painless-kibana-scripted-fields

  • enable Painless regex support by putting the following in your elasticsearch.yml:

    script.painless.regex.enabled: true

  • restart elasticsearch
  • create a new scripted field in Kibana through Management -> Index Patterns -> Scripted Fields
  • select painless as the language and number as the type
  • create the actual script, for example:
// Guard against documents that do not contain the field at all
def logMsg = params['_source']['log_message'];
if (logMsg == null) {
  return -10000;
}

// Capture the digits between "execution time:" and "ms"
def m = /.*accountExist execution time: ([0-9]+) ms.*$/.matcher(logMsg);
if (m.matches()) {
  // Return the extracted execution time as a number
  return Integer.parseInt(m.group(1));
} else {
  // Sentinel value for log lines that do not match the pattern
  return -10000;
}
  • you must reload the page completely for the new fields to be evaluated; simply re-running the search on an already-open Discover tab will not pick them up. (This almost made me quit trying to get this working -.-)
  • use the scripted field in Discover or in visualizations

While I do understand that it is not performant to script fields for millions of log entries, my use case is a very specific log entry that is logged about 10 times a day in total, and I only use the resulting fields to create a visualization or in analysis where I narrow down the candidates with regular queries beforehand.

It would be interesting to know whether these fields can be computed only in situations where you need them (or where they make sense and are computable to begin with, i.e. to make the "return -10000" unnecessary). Currently they are applied to, and show up for, every log entry.
You can also generate scripted fields inside of queries, as described here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html, but that seems a bit too buried under the hood to maintain easily :/ A rough sketch of that approach follows.
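
For reference, a query-time version along those lines might look roughly like this (again using the hypothetical app-logs index and log_message field from above; the regex literal still requires script.painless.regex.enabled):

GET /app-logs/_search
{
  "query": {
    "match": {
      "log_message": "accountExist"
    }
  },
  "script_fields": {
    "execution_time_ms": {
      "script": {
        "lang": "painless",
        "source": "def m = /.*accountExist execution time: ([0-9]+) ms.*$/.matcher(params['_source']['log_message']); return m.matches() ? Integer.parseInt(m.group(1)) : -10000;"
      }
    }
  }
}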

answered Nov 15 '22 by icyerasor