I am wondering what the best approach is for my Logstash grok filters. I have some filters that target specific log entries and won't apply to all entries. The ones that don't apply always generate _grokparsefailure tags. For example, I have one grok filter that applies to every log entry, and it works fine. Then I have another filter for error messages with tracebacks. The traceback filter adds a _grokparsefailure tag to every single log entry that doesn't have a traceback.
I'd prefer that it simply skip the rule when there isn't a match instead of adding the parse failure tag. I use that tag to find things that aren't parsing properly, not things that simply didn't match a particular filter. Maybe it's just the nomenclature "parse failure" that gets me. To me that means there's something wrong with the filter (e.g. it is badly formatted), not that it didn't match.
So the question is, how should I handle this?
Make the filter pattern optional using ?
(ab)use the tag_on_failure option by setting it to nothing []
make the filter conditional using something like "if traceback in message"
something else I'm not considering?
Thanks in advance.
EDIT
I took the path of adding a conditional around the filter:
if [message] =~ /took\s\d+/ {
  grok {
    patterns_dir => "/etc/logstash/patterns"
    match => ["message", "took\s+(?<servicetime>[\d\.]+)"]
    add_tag => [ "stats", "servicetime" ]
  }
}
Still interested in feedback though. What is considered "best practice" here?
Put simply, grok is a way to match a line against a regular expression, map specific parts of the line into dedicated fields, and perform actions based on this mapping. Built-in, there are over 200 Logstash patterns for filtering items such as words, numbers, and dates in AWS, Bacula, Bro, Linux-Syslog and more.
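Grok works by combining text patterns into something that matches your logs. The syntax for a grok pattern is %{SYNTAX:SEMANTIC}. The SYNTAX is the name of the pattern that will match your text, and the SEMANTIC is the identifier you give to the piece of text being matched. For example, "3.44" will be matched by the NUMBER pattern and "55.3.244.1" will be matched by the IP pattern.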
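As a rough illustration of the %{SYNTAX:SEMANTIC} form, here is a minimal sketch that assumes a log line shaped like "55.3.244.1 GET /index.html 15824 0.043". The field names (client, method, request, bytes, duration) are only illustrative identifiers, not anything taken from the question's logs.

```
filter {
  grok {
    # Each %{SYNTAX:SEMANTIC} pair names a built-in pattern and the
    # field that receives the matched text (field names are illustrative).
    match => {
      "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
    }
  }
}
```

If the line matches, the event should gain one field per SEMANTIC name. If it does not match, grok adds the _grokparsefailure tag by default.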
Use the ? operator to denote "zero or one occurrence of the previous token", e.g. (?:%{IP:ip})? (or maybe %{IP:ip}? is enough), although you probably want (?:\s+%{IP:ip})? so that the preceding whitespace is optional too.
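A minimal sketch of the optional-group idea, assuming a hypothetical line format of a single word optionally followed by an IP address (for example "GET" or "GET 10.0.0.1"). The field names verb and client_ip are made up for the example.

```
filter {
  grok {
    # The non-capturing group (?: ... )? wraps both the whitespace and the
    # IP pattern, so lines without an IP still match and are not tagged
    # with _grokparsefailure. Field names are hypothetical.
    match => { "message" => "^%{WORD:verb}(?:\s+%{IP:client_ip})?$" }
  }
}
```

With this approach the client_ip field is simply absent on events where the optional part did not match.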
Open the main menu, click Dev Tools, then click Grok Debugger. In Grok Pattern, enter the grok pattern that you want to apply to the data. Click Simulate. You'll see the simulated event that results from applying the grok pattern.
When possible, I'd go with a conditional wrapper just like the one you're using. Feel free to post that as an answer!
If your application produces only a few different line formats, you can use multiple match patterns with the grok filter. By default, the filter will process up to the first successful match:
grok {
  patterns_dir => "./patterns"
  match => {
    "message" => [
      "%{BASE_PATTERN} %{EXTRA_PATTERN}",
      "%{BASE_PATTERN}",
      "%{SOME_OTHER_PATTERN}"
    ]
  }
}
If your logic is less straightforward (maybe you need to check the same condition more than once), the grep filter can be useful to add a tag. Something like this:
grep {
  drop => false # grep normally drops non-matching events
  match => ["message", "/took\s\d+/"]
  add_tag => "has_traceback"
}
...
if "has_traceback" in [tags] {
  ...
}
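As far as I know, the grep filter has been deprecated in favor of conditionals and may not be installed on newer Logstash versions. A sketch of the same tagging idea using only a conditional and the mutate filter, reusing the regex and the has_traceback tag from the snippet above, might look like this:

```
filter {
  # Tag matching events once, without dropping the others.
  if [message] =~ /took\s\d+/ {
    mutate { add_tag => ["has_traceback"] }
  }

  # Later in the pipeline, the tag can be tested as often as needed.
  if "has_traceback" in [tags] {
    # filters that should only run on tagged events go here
  }
}
```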
You can also add tag_on_failure => [] to your grok stanza like so:
grok {
  match => ["context", "\"tags\":\[%{DATA:apptags}\]"]
  tag_on_failure => [ ]
}
grok will still fail, but will do so without adding to the tags array.
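A variation on the same option, sketched under the assumption that you still want some trace of the non-match: tag_on_failure takes a list of tags, so an optional filter can use its own tag name and leave _grokparsefailure for the filters that are expected to match every event. The tag name _servicetime_nomatch is made up for this example, and the pattern is borrowed from the question's edit.

```
filter {
  grok {
    match => { "message" => "took\s+(?<servicetime>[\d\.]+)" }
    # Hypothetical tag name: keeps optional-filter misses separate
    # from genuine parse failures tagged _grokparsefailure.
    tag_on_failure => ["_servicetime_nomatch"]
  }
}
```

Searching for _grokparsefailure then only surfaces events that the catch-all filter could not parse.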