
Logstash Grok filter for uwsgi logs

I'm new to the ELK stack. I'm using uWSGI as my server, and I need to parse my uWSGI logs with grok so that I can analyze them.

Here is the format of my logs:-

[pid: 7731|app: 0|req: 357299/357299] ClientIP () {26 vars in 511 bytes} [Sun Mar  1 07:47:32 2015] GET /?file_name=123&start=0&end=30&device_id=abcd&verif_id=xyzsghg => generated 28 bytes in 1 msecs (HTTP/1.0 200) 2 headers in 79 bytes (1 switches on core 0)

I used this link to generate my filter, but it didn't parse much of the information.

The filter generated by the above link is

%{SYSLOG5424SD} %{IP} () {26 vars in 511 bytes} %{SYSLOG5424SD} GET %{URIPATHPARAM} => generated 28 bytes in 1 msecs (HTTP%{URIPATHPARAM} 200) 2 headers in 79 bytes (1 switches on core 0)

Here is my logstash-conf file.

input { stdin { } }

filter {
  grok {
    match => { "message" => "%{SYSLOG5424SD} %{IP} () {26 vars in 511 bytes} %{SYSLOG5424SD} GET %{URIPATHPARAM} => generated 28 bytes in 1 msecs (HTTP%{URIPATHPARAM} 200) 2 headers in 79 bytes (1 switches on core 0)" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  stdout { codec => rubydebug }
}

When I run Logstash with this config file, the event comes back tagged with _grokparsefailure:

{
       "message" => "[pid: 7731|app: 0|req: 357299/357299] ClientIP () {26 vars in 511 bytes} [Sun Mar  1 07:47:32 2015] GET /?file_name=123&start=0&end=30&device_id=abcd&verif_id=xyzsghg => generated 28 bytes in 1 msecs (HTTP/1.0 200) 2 headers in 79 bytes (1 switches on core 0)",
      "@version" => "1",
    "@timestamp" => "2015-03-01T07:57:02.291Z",
          "host" => "cube26-Inspiron-3542",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}

The date has been properly formatted. How do I extract the other information from my logs, such as the query parameters (file_name, start, end, device_id, etc.), the client IP, and the response code?

Also, is there a built-in uWSGI log pattern that can be used, like the ones that exist for Apache and syslog?

EDIT

I wrote this on my own, but it throws the same error:

%{SYSLOG5424SD} %{IP:client_ip} () {%{NUMBER:vars} vars in %{NUMBER:bytes} bytes} %{SYSLOGTIMESTAMP:date} %{WORD:method} %{URIPATHPARAM:request} => generated %{NUMBER:generated_bytes} bytes in {NUMBER:secs} msecs (HTTP/1.0 %{NUMBER:response_code}) %{NUMBER:headers} headers in %{NUMBER:header_bytes} (1 switches on core 0)

EDIT 2

I finally managed to crack it myself. The grok pattern for the above log is:

\[pid: %{NUMBER:pid}\|app: %{NUMBER:app}\|req: %{NUMBER:req_num1}/%{NUMBER:req_num2}\] %{IP:client_ip} \(\) \{%{NUMBER:vars} vars in %{NUMBER:bytes} bytes\} %{SYSLOG5424SD} %{WORD:method} /\?file_name\=%{NUMBER:file_name}\&start\=%{NUMBER:start}\&end\=%{NUMBER:end} \=\> generated %{NUMBER:generated_bytes} bytes in %{NUMBER:secs} msecs \(HTTP/1.0 %{NUMBER:response_code}\) %{NUMBER:headers} headers in %{NUMBER:header_bytes}

But my questions still remain:

  1. Is there a default uWSGI log pattern in grok?

  2. I've been writing a separate match for each query parameter. Is there anything in grok that extracts the query parameters by itself?

PythonEnthusiast asked Mar 01 '15 08:03


2 Answers

I found a solution for extracting the query parameters: the kv filter, which splits key=value pairs into fields. Here is my final setup.

For the log line

[pid: 7731|app: 0|req: 426435/426435] clientIP () {28 vars in 594 bytes} [Mon Mar  2 06:43:08 2015] GET /?file_name=wqvqwv&start=0&end=30&device_id=asdvqw&verif_id=qwevqwr&lang=English&country=in => generated 11018 bytes in 25 msecs (HTTP/1.0 200) 2 headers in 82 bytes (1 switches on core 0)

the configuration is

input { stdin { } }

filter {
  grok {
    match => { "message" => "\[pid: %{NUMBER}\|app: %{NUMBER}\|req: %{NUMBER}/%{NUMBER}\] %{IP} \(\) \{%{NUMBER} vars in %{NUMBER} bytes\} %{SYSLOG5424SD:DATE} %{WORD} %{URIPATHPARAM} \=\> generated %{NUMBER} bytes in %{NUMBER} msecs \(HTTP/1.0 %{NUMBER}\) %{NUMBER} headers in %{NUMBER}" }
  }
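  date {
    # The grok above stores the bracketed timestamp in "DATE", e.g. "[Mon Mar  2 06:43:08 2015]".
    # Two formats are listed because single-digit days are padded with a space.
    match => [ "DATE" , "'['EEE MMM dd HH:mm:ss yyyy']'", "'['EEE MMM  d HH:mm:ss yyyy']'" ]
  }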
  kv {
    field_split => "&? "
    include_keys => [ "file_name", "device_id", "lang", "country"]
  }
}


output {
  stdout { codec => rubydebug }
  elasticsearch { host => localhost }
}
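
By default the kv filter reads the whole message field, so it also sees the text outside the query string. A minimal sketch of a narrower variant follows, assuming the request is captured into its own fields first. The field names uri_path and uri_params are placeholders chosen for illustration, and the grok pattern is abbreviated to the request portion of the line.

```
filter {
  grok {
    # Abbreviated pattern: only the method and the request are captured.
    # uri_path and uri_params are illustrative field names.
    match => { "message" => "\] %{WORD:method} %{URIPATH:uri_path}(?:%{URIPARAM:uri_params})? => generated" }
  }
  kv {
    # Parse only the query string instead of the whole message.
    source      => "uri_params"
    # Split pairs on "&"; "?" is listed so the leading "?" is not glued to the first key.
    field_split => "&?"
  }
}
```

With this layout include_keys can still be added to the kv block to keep only selected parameters.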
PythonEnthusiast answered Sep 30 '22 13:09


I found that your solution doesn't support HTTP/1.1. I fixed that and also added field names.

Here's my grok config:

grok {
  match => { "message" => "\[pid: %{NUMBER:pid}\|app: %{NUMBER:id}\|req: %{NUMBER:currentReq}/%{NUMBER:totalReq}\] %{IP:remoteAddr} \(%{WORD:remoteUser}?\) \{%{NUMBER:CGIVar} vars in %{NUMBER:CGISize} bytes\} %{SYSLOG5424SD:timestamp} %{WORD:method} %{URIPATHPARAM:uri} \=\> generated %{NUMBER:resSize} bytes in %{NUMBER:resTime} msecs \(HTTP/%{NUMBER:httpVer} %{NUMBER:status}\) %{NUMBER:headers} headers in %{NUMBER:headersSize} bytes %{GREEDYDATA:coreInfo}" }
}
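date {
  # SYSLOG5424SD keeps the square brackets, e.g. "[Sun Mar  1 07:47:32 2015]".
  # Two formats are listed because single-digit days are padded with a space.
  match => [ "timestamp" , "'['EEE MMM dd HH:mm:ss yyyy']'", "'['EEE MMM  d HH:mm:ss yyyy']'" ]
}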
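
The timestamp field above is captured with SYSLOG5424SD, which keeps the surrounding square brackets. As a hedged alternative, the sketch below captures the timestamp without brackets through a custom pattern. It assumes a Logstash version whose grok filter supports the pattern_definitions option. UWSGI_TS is a made-up pattern name, and the match is abbreviated to the timestamp and method.

```
filter {
  grok {
    # UWSGI_TS is a hypothetical pattern name, defined only for this filter.
    pattern_definitions => {
      "UWSGI_TS" => "%{DAY} %{MONTH} +%{MONTHDAY} %{TIME} %{YEAR}"
    }
    # Abbreviated pattern: only the timestamp and the method are captured.
    match => { "message" => "\[%{UWSGI_TS:timestamp}\] %{WORD:method} " }
  }
  date {
    # Two formats, because single-digit days are padded with a space ("Mar  1").
    match => [ "timestamp", "EEE MMM dd HH:mm:ss yyyy", "EEE MMM  d HH:mm:ss yyyy" ]
  }
}
```

The log line carries no time zone, so the date filter's timezone option may be needed if the server does not log in the zone Logstash runs in.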
Ben L. answered Sep 30 '22 14:09