Confused with syslog message format

I am a bit confused about the syslog message format. I have to write a program that parses syslog messages. When I read what I get from my syslog-ng instance, I see messages like this:

Jan 12 06:30:00 1.2.3.4 apache_server: 1.2.3.4 - - [12/Jan/2011:06:29:59 +0100] "GET /foo/bar.html HTTP/1.1" 301 96 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12 ( .NET CLR 3.5.30729)" PID 18904 Time Taken 0

I can clearly identify the real message (in this case, an Apache access log entry). The rest is metadata about the syslog message itself.

However, when I read RFC 5424, the message examples look like this:

without structured data

 <34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - BOM'su root' failed for lonvick on /dev/pts/8

or with structured data

<165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] BOMAn application event log entry...

So now I am a bit confused. What is the correct syslog message format? Is it a matter of spec version, where RFC 5424 obsoleted RFC 3164?

qwix asked Feb 09 '12 10:02



2 Answers

The problem in this case is that Apache is logging via the standard syslog(3) call or via logger. Both only support the old (RFC3164) syslog format, so there is no structured data here. For the fields of the Apache log to show up as RFC5424 structured data, Apache would need to format the log that way itself.

The first example is not proper RFC3164 syslog, because the priority value is stripped from the header. Proper RFC3164 format would look like this:

<34>Jan 12 06:30:00 1.2.3.4 apache_server: 1.2.3.4 - - [12/Jan/2011:06:29:59 +0100] "GET /foo/bar.html HTTP/1.1" 301 96 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12 ( .NET CLR 3.5.30729)" PID 18904 Time Taken 0

Traditionally, RFC3164 syslog messages are saved to files with the priority value removed.
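For reference, the priority value encodes facility * 8 + severity, so `<34>` means facility 4 and severity 2. A minimal Python sketch of splitting it off, assuming the message is already a decoded string (the name `split_pri` is made up for this illustration):

```python
import re

# Leading "<PRI>" field: 1 to 3 decimal digits inside angle brackets.
PRI_RE = re.compile(r"^<(\d{1,3})>")

def split_pri(line):
    """Return (facility, severity, rest_of_line).

    facility and severity are None when the line has no PRI prefix,
    which is the usual case for lines read back from a log file.
    """
    m = PRI_RE.match(line)
    if m is None:
        return None, None, line
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(int(m.group(1)), 8)
    return facility, severity, line[m.end():]
```

A parser that reads from files written by the daemon should therefore treat the priority as optional.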

The other two are in RFC5424 format.
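If your program has to accept both formats, one approach is to try the stricter RFC5424 layout first and fall back to the RFC3164 one. The following Python is only a rough sketch: the regular expressions are simplified (for example, they do not validate timestamps or field lengths), and `parse_syslog` and the field names are invented for this illustration.

```python
import re

# RFC5424: <PRI>VERSION TIMESTAMP HOST APP PROCID MSGID SD [MSG]
RFC5424_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>[1-9]\d?) "
    r"(?P<timestamp>\S+) (?P<host>\S+) (?P<app>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
    # Structured data is "-" or one or more [...] elements;
    # a "]" inside an element must be backslash-escaped.
    r"(?P<sd>-|(?:\[(?:[^\]\\]|\\.)*\])+)"
    r"(?: (?P<msg>.*))?$",
    re.DOTALL,
)

# RFC3164: optional <PRI>, "Mmm dd hh:mm:ss", host, then tag and text.
RFC3164_RE = re.compile(
    r"^(?:<(?P<pri>\d{1,3})>)?"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<msg>.*)$",
    re.DOTALL,
)

def parse_syslog(line):
    """Return a dict of fields plus a 'format' key naming the matched layout."""
    for fmt, regex in (("rfc5424", RFC5424_RE), ("rfc3164", RFC3164_RE)):
        m = regex.match(line)
        if m:
            fields = m.groupdict()
            fields["format"] = fmt
            return fields
    return {"format": "unknown", "msg": line}
```

With the examples from the question, the Apache line comes back as `rfc3164` with `pri` set to `None`, and the two RFC examples come back as `rfc5424`. Note that in the RFC's examples `BOM` stands for the UTF-8 byte order mark; the sketch leaves the message text untouched.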

b0ti answered Sep 24 '22 14:09


If you have access to the syslog daemon installed on the system, you could configure it to write the logs (whether received locally or over the network) in a different format. rsyslogd, for instance, lets you define your own format (just write a template) and, if I remember correctly, also has a built-in template for storing messages as JSON. There are libraries in almost any language to parse JSON.

EDIT: You could also make rsyslogd part of your program. rsyslog is very good at reading incoming syslog messages in either of the two RFC formats. You can then have rsyslog output the message as JSON. This way rsyslog does all the decomposition of the message for you.
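On the consuming side this reduces to reading one JSON object per line. A small Python sketch, assuming the daemon writes one object per line; the key names depend entirely on the template you configure, so the ones in the usage note are only placeholders:

```python
import json

def read_json_log(lines):
    """Yield one dict per line that holds a valid JSON object.

    Blank lines and lines that fail to parse are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict):
            yield record
```

You would pass it an open file, e.g. `for rec in read_json_log(open(path)): print(rec.get("host"), rec.get("msg"))`, where `path`, `host`, and `msg` are hypothetical and must match your own setup.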

Alexander Stumpf answered Sep 23 '22 14:09