
Logstash grok multiline message

Tags: regex, logstash

My logs are formatted like this:

2014-06-19 02:26:05,556 INFO ok
2014-06-19 02:27:05,556 ERROR
 message:space exception
         at line 85
 solution:increase space
          remove files   

There are two types of events:

- single-line events, like the first one

- multi-line events, like the second one

I am able to process the single-line events, but not the multi-line ones, where I would like to store the message in one field and the solution in another.

This is my config:

input {
  file {
    path => ["logs/*"]
    start_position => "beginning"
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => previous
    }
  }
}
filter {
  # Parse single-line events.
  grok {
    patterns_dir => "./patterns"
    match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{WORD:level} ok"]
  }
  # If that fails, assume a multi-line event. This is where I am stuck: I cannot match past the newline.
  if "_grokparsefailure" in [tags] {
    grok {
      patterns_dir => "./patterns"
      match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{WORD:level}\r\n"]
    }
  }
}

So this is what I have done, and I would like to have in my console output the following:

{
"@timestamp" => "2014-06-19 00:00:00,000"
"path" => "logs/test.log"
"level"=>"INFO"
},
{
"@timestamp" => "2014-06-19 00:00:00,000"
"path" => "logs/test.log"
"level"=>"ERROR"
"message" => "space exception at line 85"
"solution"=>"increase space remove files"
}

Concretely, I would like to capture everything between two keywords: the text between "message" and "solution" goes into the message field, and the text between "solution" and the end of the event goes into the solution field, regardless of whether that text spans one line or several.

Thanks in advance

asked Jun 19 '14 13:06 by user2443476




2 Answers

For multiline grok, it's best to put the (?m) flag at the start of the pattern string, so that the pattern can match across newlines:

grok {
    match => ["message", "(?m)%{SYSLOG5424LINE}"]
}
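
As a rough illustration of what the flag changes, the sketch below reproduces the idea outside Logstash using Python's `re` module. It rests on a few assumptions: Python's `re.DOTALL` stands in for `(?m)` in grok's regex engine (both let `.` match a newline), `TIMESTAMP` is a simplified stand-in for `%{TIMESTAMP_ISO8601}`, and the `squash` helper is hypothetical. It is not a grok configuration, only a way to check the matching logic on the sample event from the question.

```python
import re

# One event as the multiline codec would deliver it: lines joined by "\n".
event = (
    "2014-06-19 02:27:05,556 ERROR\n"
    " message:space exception\n"
    "         at line 85\n"
    " solution:increase space\n"
    "          remove files"
)

# Simplified stand-in for %{TIMESTAMP_ISO8601} (an assumption, not the real grok pattern).
TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}"

pattern = re.compile(
    rf"(?P<timestamp>{TIMESTAMP}) (?P<level>\w+)\s+"
    r"message:(?P<message>.*?)\s+solution:(?P<solution>.*)",
    re.DOTALL,  # lets "." cross newlines, the role (?m) plays in the grok pattern
)


def squash(text):
    """Collapse runs of whitespace, including newlines, into single spaces."""
    return " ".join(text.split())


match = pattern.match(event)
fields = {name: squash(value) for name, value in match.groupdict().items()}
print(fields)
```

Without the flag, the same expression fails on this event, because `.` stops at the first newline and the text before "solution:" spans two lines.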
answered Oct 23 '22 23:10 by Michael Korbakov


It looks like you have two issues:

First, you need to combine your multi-line events correctly:

filter
{
    multiline
    {
        pattern => "^ "
        what => "previous"
    }
}

This will combine any line that begins with a space into the previous line. You may end up having to use a "next" instead of a "previous".
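
To make the grouping rule concrete, here is a minimal Python sketch of the "append matching lines to the previous event" behavior. The function name `combine_multiline` and its parameters are hypothetical and only mimic the `pattern`, `negate`, and `what => "previous"` options. It is not how Logstash implements the feature.

```python
import re


def combine_multiline(lines, pattern=r"^ ", negate=False):
    """Append each continuation line to the previous event.

    A line is a continuation when it matches `pattern`, or, with
    negate=True, when it does NOT match `pattern`.
    """
    regex = re.compile(pattern)
    events = []
    for line in lines:
        is_continuation = bool(regex.search(line)) != negate
        if is_continuation and events:
            events[-1] += "\n" + line  # glue onto the previous event
        else:
            events.append(line)  # start a new event
    return events


log_lines = [
    "2014-06-19 02:26:05,556 INFO ok",
    "2014-06-19 02:27:05,556 ERROR",
    " message:space exception",
    "         at line 85",
    " solution:increase space",
    "          remove files",
]

events = combine_multiline(log_lines)
for event in events:
    print(repr(event))
```

On the sample log this yields two events. Calling the same function with a date pattern and `negate=True` (lines that do not start with a date are continuations) gives the same grouping, which is the approach the codec in the question takes.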

Second, replace the newlines:

I don't believe that grok matches across newlines by default.

I got around this by adding the following to the filter section. It should go before the grok section:

mutate
{
    gsub => ["message", "\n", "LINE_BREAK"]
}

This allowed me to grok multilines as one big line rather than matching only till the "\n".
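
The effect of that substitution can be sketched in Python as follows. This is an illustration under stated assumptions, not Logstash code: `str.replace` stands in for the `gsub` above, the regular expression is a hypothetical pattern for the sample event in the question, and `clean` is a helper invented here to turn the placeholders back into spaces.

```python
import re

# One combined event, with lines joined by "\n".
event = (
    "2014-06-19 02:27:05,556 ERROR\n"
    " message:space exception\n"
    "         at line 85\n"
    " solution:increase space\n"
    "          remove files"
)

# Stand-in for: gsub => ["message", "\n", "LINE_BREAK"]
flattened = event.replace("\n", "LINE_BREAK")

# The event is now a single line, so "." needs no special flag to span it.
pattern = re.compile(
    r"message:(?P<message>.*?)LINE_BREAK\s*solution:(?P<solution>.*)"
)
match = pattern.search(flattened)


def clean(text):
    """Turn LINE_BREAK placeholders into spaces and collapse extra whitespace."""
    return " ".join(text.replace("LINE_BREAK", " ").split())


message = clean(match.group("message"))
solution = clean(match.group("solution"))
print(message)
print(solution)
```

One design point to keep in mind with this approach: the placeholder has to be a string that never occurs in real log text, otherwise it cannot be told apart from genuine content when it is converted back.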

answered Oct 24 '22 00:10 by alexpotato