I've got data coming from Kafka and I want to send it to Elasticsearch. I've got a log like this with tags:
<TOTO><ID_APPLICATION>APPLI_A|PRF|ENV_1|00</ID_APPLICATION><TN>3</TN></TOTO>
I'm trying to parse it with grok, using the Grok Debugger:
\<ID_APPLICATION\>%{WORD:APPLICATION}\|%{WORD:PROFIL}\|%{WORD:ENV}\|%{WORD:CODE}\</ID_APPLICATION\>\<TN\>%{NUMBER:TN}\</TN\>
It works, but sometimes the log has a new field like this (the one with the <TP> tag):
<TOTO><ID_APPLICATION>APPLI_A|PRF|ENV_1|00</ID_APPLICATION><TN>3</TN><TP>new</TP></TOTO>
I'd like to match both the lines that have this field (the TP tag) and the lines that don't. How can I do that?
If you have an optional field, you can match it with an optional non-capturing group wrapped around the field's pattern:
(?:<TP>%{WORD:TP}</TP>)?
^^^                   ^
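The non-capturing group (?:...) does not save its submatch in memory and is used for grouping only, and the ? quantifier after it matches the group 1 or 0 times (i.e. makes it optional). When the tag is present, the pattern creates a TP field holding the text matched by WORD. When the tag is absent, nothing is captured for TP and the Grok Debugger shows its value as null.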
So, the whole pattern will look like:
<ID_APPLICATION>%{WORD:APPLICATION}\|%{WORD:PROFIL}\|%{WORD:ENV}\|%{WORD:CODE}</ID_APPLICATION><TN>%{NUMBER:TN}</TN>(?:<TP>%{WORD:TP}</TP>)?
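To see the optional group's behaviour outside of grok, here is a minimal Python sketch of the same idea using the standard re module. It is not how Logstash evaluates the pattern: the \w+ and number expressions are simplified stand-ins I am assuming for grok's WORD and NUMBER, and the parse helper is a hypothetical name used only for illustration.

```python
import re

# Simplified stand-ins for grok's WORD and NUMBER patterns (assumptions,
# not the exact grok definitions).
PATTERN = re.compile(
    r"<ID_APPLICATION>(?P<APPLICATION>\w+)\|(?P<PROFIL>\w+)\|(?P<ENV>\w+)\|(?P<CODE>\w+)</ID_APPLICATION>"
    r"<TN>(?P<TN>[+-]?\d+(?:\.\d+)?)</TN>"
    # Optional non-capturing group: the whole <TP>...</TP> tag may be absent.
    r"(?:<TP>(?P<TP>\w+)</TP>)?"
)


def parse(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


# The two sample lines from the question.
without_tp = parse(
    "<TOTO><ID_APPLICATION>APPLI_A|PRF|ENV_1|00</ID_APPLICATION><TN>3</TN></TOTO>"
)
with_tp = parse(
    "<TOTO><ID_APPLICATION>APPLI_A|PRF|ENV_1|00</ID_APPLICATION><TN>3</TN><TP>new</TP></TOTO>"
)

print(without_tp["TP"])  # None: the optional group matched zero times
print(with_tp["TP"])     # new
```

Both lines match the same expression; only the TP entry differs, which mirrors the null value the Grok Debugger reports for the line without the tag.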