Parsing apache log files

Tags:

python

file-io

I just started learning Python and would like to read an Apache log file and put parts of each line into different lists.

A line from the file:

172.16.0.3 - - [25/Sep/2002:14:04:19 +0200] "GET / HTTP/1.1" 401 - "" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020827"

According to the Apache website, the format is:

%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"

I can open the file and read it as is, but I don't know how to parse each line according to that format so that I can put each part in a list.

ogward asked Sep 22 '12 14:09



1 Answer

This is a job for regular expressions.

For example:

import re

line = '172.16.0.3 - - [25/Sep/2002:14:04:19 +0200] "GET / HTTP/1.1" 401 - "" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020827"'

# Raw string so the backslashes reach the regex engine unchanged.
regex = r'([\d.]+) - - \[(.*?)\] "(.*?)" (\d+) - "(.*?)" "(.*?)"'

print(re.match(regex, line).groups())

The output would be a tuple with 6 pieces of information from the line (specifically, the groups within parentheses in that pattern):

('172.16.0.3', '25/Sep/2002:14:04:19 +0200', 'GET / HTTP/1.1', '401', '', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020827') 
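The pattern above hard-codes "- -" and a "-" byte count, so it only matches lines that look like the sample. To get one list per field, as the question asks, a rough sketch could look like the following. It assumes the log uses the combined format quoted in the question and that quoted fields contain no escaped quotes. The names LOG_PATTERN, parse_lines, and the group names are made up for illustration and are not part of any library.

```python
import re

# Named groups for the combined format quoted in the question:
# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
# Assumption: quoted fields contain no escaped double quotes.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_lines(lines):
    """Return a dict mapping each field name to a list of values.

    Lines that do not match the pattern are skipped.
    """
    fields = {name: [] for name in LOG_PATTERN.groupindex}
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue
        for name, value in match.groupdict().items():
            fields[name].append(value)
    return fields


sample = [
    '172.16.0.3 - - [25/Sep/2002:14:04:19 +0200] "GET / HTTP/1.1" 401 - "" '
    '"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020827"',
]
columns = parse_lines(sample)
print(columns["host"])
print(columns["status"])
```

Because a file object yields its lines when iterated, the same function should work on a real log by passing the open file (for example inside a with open(...) block) instead of the sample list.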
David Robinson answered Sep 20 '22 04:09