I have an Apache access.log file that is around 35 GB in size. Grepping through it is no longer an option without a long wait.
I want to split it into many smaller files, using the date as the splitting criterion.
The date is in the format [15/Oct/2011:12:02:02 +0000]. Any idea how I could do this using only bash scripting, standard text manipulation programs (grep, awk, sed, and the like), piping, and redirection?
The input file name is access.log. I'd like the output files to be named something like access.apache.15_Oct_2011.log (that would do the trick, although it is not nice when sorting).
Linux systems typically save their log files under the /var/log directory. This works fine, but check whether the application saves its logs in a specific directory under /var/log. If it does, great. If not, you may want to create a dedicated directory for the app under /var/log.
Use the tail command to look at the last 2-3 records of the log. If the date format is 20/Aug/2021:07:23:07, that is DD/MMM/YYYY:HH:MM:SS, then %d/%b/%Y:%H:%M:%S is the format specifier of your date column, and it can be used together with awk to extract, for example, the data for the last 2 minutes.
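As a rough sketch of that kind of time filter, the hypothetical function below prints every entry at or after a cutoff timestamp. The name entries_since and its arguments are assumptions made for illustration, and it assumes the timestamp is the fourth whitespace-separated field. Because DD/Mon/YYYY strings do not sort chronologically, each timestamp is first converted into a YYYYMMDDHHMMSS key.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print log entries at or after a cutoff.
# $1 = cutoff in the log's own format, e.g. 20/Aug/2021:07:21:07
# $2 = log file (assumes the timestamp is field 4, e.g. "[20/Aug/2021:07:23:07")
entries_since() {
    awk -v cutoff="$1" '
    function key(ts,   p) {
        # Turn DD/Mon/YYYY:HH:MM:SS into YYYYMMDDHHMMSS so that a plain
        # comparison follows chronological order.
        split(ts, p, "[/:]")
        return p[3] mon[p[2]] p[1] p[4] p[5] p[6]
    }
    BEGIN {
        # Map month abbreviations to zero-padded numbers.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) mon[names[i]] = sprintf("%02d", i)
        c = key(cutoff)
    }
    # Drop the leading "[" from field 4 before building the key.
    key(substr($4, 2)) >= c
    ' "$2"
}
```

With GNU date (the -d option is GNU-specific), a cutoff for the last 2 minutes could be produced with something like entries_since "$(LC_ALL=C date -d '2 minutes ago' '+%d/%b/%Y:%H:%M:%S')" access.log, assuming the log and the local clock use the same time zone.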
/var/log/messages - This file holds the global system messages, including those logged during system startup. Depending on how the syslog config file is set up, several kinds of messages are logged here, including mail, cron, daemon, kern, auth, etc.
One way using awk:
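awk 'BEGIN {
    # Map month abbreviations to zero-padded numbers: m["Jan"] = "01", ...
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", months, " ")
    for (a = 1; a <= 12; a++)
        m[months[a]] = sprintf("%02d", a)
}
{
    # $4 looks like "[15/Oct/2011:12:02:02"; split it on ":" and "/"
    split($4, array, "[:/]")
    year = array[3]
    month = m[array[2]]
    # Parenthesize the target so the concatenation is parsed portably
    print > (FILENAME "-" year "_" month ".txt")
}' incendiary.ws-2010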
This will output files like:
incendiary.ws-2010-2010_04.txt
incendiary.ws-2010-2010_05.txt
incendiary.ws-2010-2010_06.txt
incendiary.ws-2010-2010_07.txt
Against a 150 MB log file, the answer by chepner took 70 seconds on a 3.4 GHz 8-core Xeon E31270, while this method took 5 seconds.
Original inspiration: "How to split existing apache logfile by month?"
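The question asked for one file per day rather than per month. A minimal sketch of how the same awk idea might be adapted is shown below. The function name split_by_day is hypothetical, and the sketch assumes the timestamp is the fourth whitespace-separated field and that entries are in chronological order, so the previous day's file can be closed as soon as the date changes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split an Apache access log into one file per day,
# named like access.apache.15_Oct_2011.log, in the current directory.
# $1 = input log (assumes field 4 looks like "[15/Oct/2011:12:02:02")
split_by_day() {
    awk '
    {
        # Drop the leading "[" and split the timestamp on "/" and ":"
        split(substr($4, 2), t, "[/:]")
        out = "access.apache." t[1] "_" t[2] "_" t[3] ".log"
        # Close the previous file when the day changes, so that only one
        # output file is open at a time.
        if (out != prev) {
            if (prev != "") close(prev)
            prev = out
        }
        # Append, so a day that reappears later is not truncated.
        print >> out
    }' "$1"
}
```

It would be called as split_by_day access.log. Because the output is appended, running it twice in the same directory duplicates lines, so old output files should be removed first. For names that sort chronologically, the year, a numeric month (using a lookup like the m array in the answer above), and the day could be concatenated in that order instead.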