
How do I match a newline in grok/logstash?

I have a remote machine that combines multiline events and sends them across the lumberjack protocol.

What comes in is something that looks like this:

{      "message" => "2014-10-20T20:52:56.133+0000 host 2014-10-20 15:52:56,036 [ERROR   ][app.logic     ] Failed to turn message into JSON\nTraceback (most recent call last):\n  File \"somefile.py", line 249, in _get_values\n    return r.json()\n  File \"/path/to/env/lib/python3.4/site-packages/requests/models.py\", line 793, in json\n    return json.loads(self.text, **kwargs)\n  File \"/usr/local/lib/python3.4/json/__init__.py\", line 318, in loads\n    return _default_decoder.decode(s)\n  File \"/usr/local/lib/python3.4/json/decoder.py\", line 343, in decode\n    obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n  File \"/usr/local/lib/python3.4/json/decoder.py\", line 361, in raw_decode\n    raise ValueError(errmsg(\"Expecting value\", s, err.value)) from None\nValueError: Expecting value: line 1 column 1 (char 0), Failed to turn message into JSON" } 

When I try to match the message with

grok {
    match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:loglevel}%{SPACE}\]\[%{NOTSPACE:module}%{SPACE}\]%{GREEDYDATA:message}" ]
}

the GREEDYDATA is not nearly as greedy as I would like.

So then I tried to use gsub:

mutate {
    gsub => ["message", "\n", "LINE_BREAK"]
}
# Grok goes here
mutate {
    gsub => ["message", "LINE_BREAK", "\n"]
}

but that didn't work: rather than

The Quick brown fox jumps over the lazy groks 

I got

The Quick brown fox\njumps over the lazy\ngroks 

So...

How do I either add the newline back to my data, make the GREEDYDATA match my newlines, or in some other way grab the relevant portion of my message?

asked Oct 20 '14 21:10 by Wayne Werner



2 Answers

GREEDYDATA is just .*, but . doesn't match a newline, so you can replace %{GREEDYDATA:message} with (?<message>(.|\r|\n)*) and get it to be truly greedy.
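The difference between .* and an alternation that explicitly allows line breaks can be tried outside Logstash. The following is a minimal sketch that uses Python's re module as a stand-in for grok's regex engine. The sample event is a shortened, made-up version of the one in the question, and Python writes named groups as (?P<name>...) rather than (?<name>...).

```python
import re

# Shortened, hypothetical stand-in for the multiline event in the question.
event = (
    "[app.logic] Failed to turn message into JSON\n"
    "Traceback (most recent call last):\n"
    "ValueError: Expecting value"
)

# ".*" is what GREEDYDATA expands to; "." stops at the first newline.
dot_star = re.search(r"\[(?P<module>\S+)\] (?P<message>.*)", event)
first_line_only = dot_star.group("message")

# Allowing \r and \n explicitly lets the group run to the end of the event.
explicit = re.search(r"\[(?P<module>\S+)\] (?P<message>(.|\r|\n)*)", event)
whole_event = explicit.group("message")

print(repr(first_line_only))
print(repr(whole_event))
```

The first capture ends at the first line break, while the second keeps the traceback lines as well.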

answered Sep 30 '22 02:09 by Alcanzar


Adding the (?m) regex flag to the beginning of the pattern allows . to match newlines:

match => [ "message", "(?m)%{TIMESTA... 
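The effect of an inline flag at the start of a pattern can be sketched in Python, with one caveat: Python spells "dot matches newline" as (?s), whereas the Ruby-style engine behind grok uses (?m) for the same behavior. The sample event and the simplified pattern below are assumptions for illustration, not the full pattern from the question.

```python
import re

# Hypothetical, shortened event; not the full message from the question.
event = (
    "2014-10-20 15:52:56,036 [ERROR] Failed to turn message into JSON\n"
    "Traceback (most recent call last):\n"
    "ValueError: Expecting value"
)

# Simplified pattern: a log level in brackets, then everything after it.
pattern = r"\[(?P<loglevel>[A-Z]+)\] (?P<message>.*)"

# Same pattern twice; the only difference is the inline flag at the very start.
# Python's (?s) plays the role that (?m) plays in a grok pattern.
without_flag = re.search(pattern, event)
with_flag = re.search("(?s)" + pattern, event)

print(repr(without_flag.group("message")))
print(repr(with_flag.group("message")))
```

Because the flag sits at the very beginning, it applies to the whole pattern, so no individual sub-pattern has to be rewritten.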
answered Sep 30 '22 02:09 by Wayne Werner