
Grok - parsing optional fields

I have data coming from Kafka that I want to send to Elasticsearch. A log line looks like this, with tags:

<TOTO><ID_APPLICATION>APPLI_A|PRF|ENV_1|00</ID_APPLICATION><TN>3</TN></TOTO>

I'm trying to parse it with grok, testing the pattern in the Grok Debugger:

\<ID_APPLICATION\>%{WORD:APPLICATION}\|%{WORD:PROFIL}\|%{WORD:ENV}\|%{WORD:CODE}\</ID_APPLICATION\>\<TN\>%{NUMBER:TN}\</TN\>

It works, but sometimes the log has a new field like this (the one with the tag <TP>):

<TOTO><ID_APPLICATION>APPLI_A|PRF|ENV_1|00</ID_APPLICATION><TN>3</TN><TP>new</TP></TOTO>

I'd like to match both the lines that have this field (the TP tag) and the lines that don't. How can I do that?

David asked Jan 12 '16 15:01


1 Answer

If you have an optional field, you can wrap it in an optional non-capturing group that contains the grok named capture:

(?:<TP>%{WORD:TP}</TP>)?
^^^                    ^

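The non-capturing group (?:...) is used for grouping only and does not store a submatch of its own, and the ? quantifier after it matches the group 1 or 0 times (i.e. it is optional). When the tag is present, %{WORD:TP} creates a TP field holding the matched word. When the tag is absent, the value will be null in the Grok Debugger; in Logstash, with the default keep_empty_captures => false setting, the empty TP field is simply left out of the event.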

So, the whole pattern will look like:

<ID_APPLICATION>%{WORD:APPLICATION}\|%{WORD:PROFIL}\|%{WORD:ENV}\|%{WORD:CODE}</ID_APPLICATION><TN>%{NUMBER:TN}</TN>(?:<TP>%{WORD:TP}</TP>)?
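To check the behaviour of the optional group outside Logstash, here is a minimal Python sketch of the same idea. It is not grok itself: the WORD and NUMBER definitions below are simplified stand-ins I am assuming for the real grok patterns, and parse is a hypothetical helper name. It only illustrates that one regex with an optional non-capturing group matches both kinds of lines.

```python
import re

# Simplified stand-ins for the grok patterns (assumptions, not the
# exact grok definitions): WORD ~ \w+, NUMBER ~ signed int or decimal.
WORD = r"\w+"
NUMBER = r"[+-]?\d+(?:\.\d+)?"

# Same structure as the grok pattern above; the trailing (?:...)?
# makes the whole <TP>...</TP> element optional.
PATTERN = re.compile(
    rf"<ID_APPLICATION>(?P<APPLICATION>{WORD})\|(?P<PROFIL>{WORD})"
    rf"\|(?P<ENV>{WORD})\|(?P<CODE>{WORD})</ID_APPLICATION>"
    rf"<TN>(?P<TN>{NUMBER})</TN>"
    rf"(?:<TP>(?P<TP>{WORD})</TP>)?"
)


def parse(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


without_tp = parse(
    "<TOTO><ID_APPLICATION>APPLI_A|PRF|ENV_1|00</ID_APPLICATION>"
    "<TN>3</TN></TOTO>"
)
with_tp = parse(
    "<TOTO><ID_APPLICATION>APPLI_A|PRF|ENV_1|00</ID_APPLICATION>"
    "<TN>3</TN><TP>new</TP></TOTO>"
)

print(without_tp["TP"])  # None: the optional group did not participate
print(with_tp["TP"])     # new
```

Both sample lines from the question match; only the value of TP differs, which mirrors the null you see in the Grok Debugger when the tag is missing.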
Wiktor Stribiżew answered Sep 29 '22 17:09