 

How to parse Apache logs using a regex in PHP

Tags:

regex

php

I'm trying to split this string in PHP:

11.11.11.11 - - [25/Jan/2000:14:00:01 +0100] "GET /1986.js HTTP/1.1" 200 932 "http://domain.com/index.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 GTB6"

How can I split this into IP, date, HTTP method, domain name, and browser?

streetparade asked Feb 08 '10 12:02


2 Answers

This looks like Apache's combined log format. Try this regular expression:

/^(\S+) \S+ \S+ \[([^\]]+)\] "([A-Z]+)[^"]*" \d+ \d+ "[^"]*" "([^"]*)"$/m

The matching groups are as follows:

  1. remote IP address
  2. request date
  3. request HTTP method
  4. User-Agent value

Note that the log line does not contain a domain-name field. The second quoted string is the Referer value, which this pattern matches but does not capture.
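
As a minimal sketch of how the pattern could be applied in PHP, the snippet below runs it against the sample line from the question with preg_match() and copies the four groups into an array. The variable names ($line, $pattern, $entry) and the array keys are illustrative assumptions, not part of the original answer. One caveat worth checking against your own logs: Apache can write "-" instead of a number for the response size, and \d+ would not match such a line.

```php
<?php
// Sample line taken from the question.
$line = '11.11.11.11 - - [25/Jan/2000:14:00:01 +0100] "GET /1986.js HTTP/1.1" 200 932 "http://domain.com/index.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 GTB6"';

// The pattern from the answer, unchanged.
$pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "([A-Z]+)[^"]*" \d+ \d+ "[^"]*" "([^"]*)"$/m';

// $entry and its keys are hypothetical names chosen for this example.
$entry = null;
if (preg_match($pattern, $line, $m)) {
    $entry = [
        'ip'         => $m[1], // group 1: remote IP address
        'date'       => $m[2], // group 2: request date
        'method'     => $m[3], // group 3: HTTP method
        'user_agent' => $m[4], // group 4: User-Agent value
    ];
}

// $entry stays null when the line does not match the pattern.
print_r($entry);
```

Because the pattern carries the m modifier, the same expression could also be passed to preg_match_all() over a whole file's contents, assuming every line follows this format.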

Gumbo answered Nov 14 '22 09:11


You should check out a regular expression tutorial. But here is the answer:

if (preg_match('/^(\S+) \S+ \S+ \[(.*?)\] "(\S+).*?" \d+ \d+ "(.*?)" "(.*?)"/', $line, $m)) {
  $ip = $m[1];
  $date = $m[2];
  $method = $m[3];
  $referer = $m[4];
  $browser = $m[5];
}

Note that the log contains the HTTP referer, not the domain name.
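
If the host name of the referring page is what is actually wanted, it could be derived from the captured referer. The following is only a sketch under that assumption: it starts from the values the snippet above would capture for the sample line, uses parse_url() to pull out the host, and, as an optional extra, turns the date string into a DateTime object. The variable names $host and $when are made up for this example.

```php
<?php
// Values as captured from the sample line in the question.
$date    = '25/Jan/2000:14:00:01 +0100';
$referer = 'http://domain.com/index.html';

// parse_url() with PHP_URL_HOST returns only the host part.
// For a referer logged as "-" there is no host, so the result is null.
$host = parse_url($referer, PHP_URL_HOST);

// Format string: day/short month name/year:hour:minute:second UTC offset.
// createFromFormat() returns false if the string does not fit the format.
$when = DateTime::createFromFormat('d/M/Y:H:i:s O', $date);

echo $host, "\n"; // prints "domain.com"
if ($when !== false) {
    echo $when->format(DATE_ATOM), "\n";
}
```

Keep in mind that this yields the host of the page that linked to the resource, which is not necessarily the domain the request was sent to.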

KARASZI István answered Nov 14 '22 09:11