I want to add fields for specific URI params in my log lines.
Here is an example log line:
2017-03-12 21:34:36 W3SVC1 webserver 1.1.1.1 GET /webpage.html param1=11111&param2=22222&param3=&param4=4444444 80 - 2.2.2.2 HTTP/1.1 Java/1.8.0_121 - - balh.com 200 0 0 311 244 247 - -
I want to add fields for param1, param2, param3 and param4.
I am using this grok filter:
grok {
match => [ "message", "(?<param1>param1=(.*?)&)"]
}
So this regex uses a capture group to get the text between "param1=" and "&". But grok ignores the capture group and returns "param1=11111&". I just want to capture the "11111".
How can I say use capture group 1 or tell grok to use my regex capture group?
Edit: This almost works:
grok {
match => [ "message", "(?<param1>param1=(?<param1>.*?)&)"]
}
So I guess what I'm doing here is using two named groups with the same name. The problem is that the "param1" field now has two entries, one for each group: "param1=11111&" and "11111". How do I get just the second group?
How can I say use capture group 1 or tell grok to use my regex capture group?
By default, grok only considers named capturing groups; numbered capturing groups do not create fields. If you want to override this behavior, set named_captures_only to false:
named_captures_only
- Value type is boolean
- Default value is true
If true, only store named captures from grok.
However, there is nothing wrong with using a named capturing group (and I'd use a negated character class, [^&]*, instead of a lazy dot followed by a consuming &):
\bparam1=(?<param1>[^&]*)
[^&]* matches 0 or more characters other than &, and thus will also match an empty parameter (which you may want to avoid by changing * to +, or control with the keep_empty_captures parameter) and a parameter at the end of the string.