
Filter log file entries based on date range

My server is showing unusually high CPU usage, and I can see Apache is using far too much memory. I have a feeling I'm being DoS'd by a single IP. Maybe you can help me find the attacker?

I've used the following line to find the 10 most "active" IPs:

cat access.log | awk '{print $1}' |sort  |uniq -c |sort -n |tail 

The top 5 IPs have about 200 times as many requests to the server as the "average" user. However, I can't tell whether these 5 are just very frequent visitors or are attacking the server.

Is there a way to restrict the above search to a time interval, e.g. the last two hours OR between 10 and 12 today?

Cheers!

UPDATED 23 OCT 2011 - The commands I needed:

Get entries within last X hours [Here two hours]

awk -vDate=`date -d'now-2 hours' +[%d/%b/%Y:%H:%M:%S` ' { if ($4 > Date) print Date FS $4}' access.log 

Get most active IPs within the last X hours [Here two hours]

awk -vDate=`date -d'now-2 hours' +[%d/%b/%Y:%H:%M:%S` ' { if ($4 > Date) print $1}' access.log | sort  |uniq -c |sort -n | tail 

Get entries within relative timespan

awk -vDate=`date -d'now-4 hours' +[%d/%b/%Y:%H:%M:%S` -vDate2=`date -d'now-2 hours' +[%d/%b/%Y:%H:%M:%S` ' { if ($4 > Date && $4 < Date2) print Date FS Date2 FS $4}' access.log 

Get entries within absolute timespan

awk -vDate=`date -d '13:20' +[%d/%b/%Y:%H:%M:%S` -vDate2=`date -d'13:30' +[%d/%b/%Y:%H:%M:%S` ' { if ($4 > Date && $4 < Date2) print $0}' access.log  

Get most active IPs within absolute timespan

awk -vDate=`date -d '13:20' +[%d/%b/%Y:%H:%M:%S` -vDate2=`date -d'13:30' +[%d/%b/%Y:%H:%M:%S` ' { if ($4 > Date && $4 < Date2) print $1}' access.log | sort  |uniq -c |sort -n | tail 
sqren asked Oct 09 '11 19:10




2 Answers

Yes, there are multiple ways to do this. Here is how I would go about it. For starters, there is no need to pipe the output of cat; just open the log file with awk.

awk -vDate=`date -d'now-2 hours' +[%d/%b/%Y:%H:%M:%S` '$4 > Date {print Date, $0}' access_log 

Assuming your log looks like mine (the format is configurable), the date is stored in field 4 and starts with an opening bracket. What I am doing above is finding everything within the last 2 hours. Note the -d'now-2 hours', or translated literally "now minus 2 hours", which for me produces something like this: [10/Oct/2011:08:55:23

So what I am doing is storing the formatted value of two hours ago and comparing it against field four. The conditional expression should be straightforward. I am then printing the Date, followed by the Output Field Separator (OFS, a space in this case), followed by the whole line $0. You could use your previous expression and just print $1 (the IP addresses):

awk -vDate=`date -d'now-2 hours' +[%d/%b/%Y:%H:%M:%S` '$4 > Date {print $1}' access_log | sort  |uniq -c |sort -n | tail 

If you want to use a range, specify two date variables and construct your expression accordingly.

So if you wanted to find something between 2 and 4 hours ago, your expression might look something like this:

awk -vDate=`date -d'now-4 hours' +[%d/%b/%Y:%H:%M:%S` -vDate2=`date -d'now-2 hours' +[%d/%b/%Y:%H:%M:%S` '$4 > Date && $4 < Date2 {print Date, Date2, $4}' access_log 
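
One caveat with the comparisons above: $4 > Date compares strings, and a dd/Mon/yyyy timestamp does not sort chronologically as a string once the range crosses a month boundary (for example, [01/Nov/2011 sorts before [31/Oct/2011). A minimal sketch of one way around this, assuming the same log layout as above (timestamp in field 4), is to rebuild each timestamp as a YYYYMMDDHHMMSS key before comparing. The function name entries_between and its argument order are made up for this illustration:

```shell
#!/bin/sh
# Hypothetical helper (not from the answer above): print every log line whose
# timestamp lies in [FROM, UPTO], both given as YYYYMMDDHHMMSS keys.
# Assumes field 4 looks like [10/Oct/2011:08:55:23
# Usage: entries_between FROM UPTO LOGFILE
entries_between() {
    awk -v from="$1" -v upto="$2" '
    BEGIN {
        # Map month abbreviations to two-digit numbers.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = sprintf("%02d", i)
    }
    {
        # Drop the leading "[" and split on "/" and ":".
        split(substr($4, 2), t, "[/:]")
        # Year, month, day, hour, minute, second: this order sorts chronologically.
        key = t[3] month[t[2]] t[1] t[4] t[5] t[6]
        if (key >= from && key <= upto) print
    }' "$3"
}
```

For the last two hours this could be called as follows (date -d is GNU date, as in the commands above, and the log is assumed to be written in the same time zone that date reports):

entries_between "$(date -d 'now-2 hours' +%Y%m%d%H%M%S)" "$(date +%Y%m%d%H%M%S)" access_log | awk '{print $1}' | sort | uniq -c | sort -n | tail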

Here is a question I answered regarding dates in bash that you might find helpful: Print date for the monday of the current week (in bash)

matchew answered Sep 20 '22 14:09


This is a common Perl task.

It is also not exactly the same as "extract last 10 minutes from logfile", which is about a span of time running up to the end of the logfile.

And because I needed it myself, I (quickly) wrote this:

#!/usr/bin/perl -ws
# This script parses logfiles for a specific period of time

sub usage {
    printf "Usage: %s -s=<start time> [-e=<end time>] <logfile>\n", $0;
    die $_[0] if $_[0];
    exit 0;
}

use Date::Parse;

usage "No start time submitted" unless $s;
my $startim=str2time($s) or die;

my $endtim=str2time($e) if $e;
$endtim=time() unless $e;

usage "Logfile not submitted" unless $ARGV[0];
open my $in, "<" . $ARGV[0] or usage "Can't open '$ARGV[0]' for reading";
$_=<$in>;
exit unless $_; # empty file
# Determining regular expression, depending on log format
# (the first line is only used for this detection)
my $logre=qr{^(\S{3}\s+\d{1,2}\s+(\d{2}:){2}\d+)};
$logre=qr{^[^\[]*\[(\d+/\S+/(\d+:){3}\d+\s\+\d+)\]} unless /$logre/;

while (<$in>) {
    /$logre/ && do {
        my $ltim=str2time($1);
        print if $endtim >= $ltim && $ltim >= $startim;
    };
};

This could be used like:

./timelapsinlog.pl -s=09:18 -e=09:24 /path/to/logfile 

for printing logs between 09h18 and 09h24.

./timelapsinlog.pl -s='2017/01/23 09:18:12' /path/to/logfile 

for printing from January 23rd, 9h18'12" up to now.

To keep the Perl code short, I've used the -s switch to allow automatic assignment of variables from the command line: -s=09:18 will populate a variable $s which will contain 09:18. Take care not to omit the equals sign =, and don't add spaces!

Note: This contains two different regexes for two different log formats. If you need to parse a different date/time format, either post your own regex or post a sample of a formatted date from your logfile.

^(\S{3}\s+\d{1,2}\s+(\d{2}:){2}\d+)         # ^Jan  1 01:23:45
^[^\[]*\[(\d+/\S+/(\d+:){3}\d+\s\+\d+)\]    # ^... [01/Jan/2017:01:23:45 +0000]
F. Hauri answered Sep 21 '22 14:09