I need to exclude some sensitive details from my Apache log, but I want to keep the log and the URIs in it. Is it possible to achieve the following in my access log? A current entry looks like this:
127.0.0.1 - - [27/Feb/2012:13:18:12 +0100] "GET /api.php?param=secret HTTP/1.1" 200 7600 "http://localhost/api.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11"
I want to replace "secret" with "[FILTERED]" like this:
127.0.0.1 - - [27/Feb/2012:13:18:12 +0100] "GET /api.php?param=[FILTERED] HTTP/1.1" 200 7600 "http://localhost/api.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11"
I know I probably should have used POST to send this variable, but the damage is already done. I've looked at http://httpd.apache.org/docs/2.4/logs.html and LogFormat, but could not find any way to use a regular expression or similar. Any suggestions?
[edit]
Do NOT send sensitive variables as GET parameters if you have the possibility to choose.
Although manually searching the Apache access.log file is adequate for small tasks, it quickly becomes cumbersome on a server handling thousands of requests, and it offers no real-time view of the logs. In such cases, a tool such as goaccess can analyze the logs in real time.
Log format: on Linux, Apache commonly writes logs to the /var/log/apache2 or /var/log/httpd directory, depending on your OS and virtual host overrides. You can also give a format string after the filename in a CustomLog directive, which applies that format to this file only.
What is an access log? An access log is a list of all requests for individual files -- such as Hypertext Markup Language files, their embedded graphic images and other associated files that get transmitted -- that people or bots have made from a website.
What are Apache access logs? The Apache access log is one of several log files produced by an Apache HTTP server. It records data for every request the server processes.
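I've found one way to solve the problem. If I pipe the log output to sed, I can perform a regex replace on the output before it is appended to the log file. Note that Apache 2.4 starts piped log programs without a shell by default; writing the pipe as "|$" instead of "|" spawns the command through a shell, which the >> redirection in the examples below depends on.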
Example 1
CustomLog "|/bin/sed -E s/'param=[^& \t\n]*'/'param=\[FILTERED\]'/g >> /your/path/access.log" combined
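Before wiring a substitution like this into Apache, it can help to try it on a sample line from the shell. The following is a minimal sketch, not the exact command above: the function name filter_param and the shortened sample line are made up for illustration, and the character class is written as [^&[:space:]] on the assumption that a POSIX class behaves more consistently than \t and \n across sed implementations.

```shell
#!/bin/bash
# Hypothetical helper: applies a substitution equivalent to the CustomLog pipe to stdin.
filter_param() {
    # Replace everything after "param=" up to the next "&" or whitespace.
    sed -E 's/param=[^&[:space:]]*/param=[FILTERED]/g'
}

# Shortened sample entry (illustrative only).
sample='127.0.0.1 - - [27/Feb/2012:13:18:12 +0100] "GET /api.php?param=secret&x=1 HTTP/1.1" 200 7600'
printf '%s\n' "$sample" | filter_param
# prints: 127.0.0.1 - - [27/Feb/2012:13:18:12 +0100] "GET /api.php?param=[FILTERED]&x=1 HTTP/1.1" 200 7600
```

Because the pattern is not anchored, it also rewrites any parameter whose name merely ends in "param" (for example myparam=...), which may or may not be what you want.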
Example 2
It's also possible to exclude several parameters:
exclude.sh
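#!/bin/bash
# Reads log lines on stdin and masks the value of every parameter named in "$@".
while IFS= read -r line; do
    result=$line
    for ARG in "$@"; do
        # Escape every character other than letters, digits and "_" so the name matches literally.
        cleanArg=$(printf '%s' "$ARG" | sed -E 's|([^0-9a-zA-Z_])|\\\1|g')
        result=$(printf '%s\n' "$result" | sed -E "s/${cleanArg}=[^& \t\n]*/${cleanArg}=[FILTERED]/g")
    done
    printf '%s\n' "$result"
done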
Move the script above to the folder /opt/scripts/ or somewhere else, give it execute rights (chmod +x exclude.sh), and modify your Apache config like this:
CustomLog "|/opt/scripts/exclude.sh param param1 param2 >> /your/path/access.log" combined
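The per-line loop in exclude.sh starts new sed processes for every log line and every parameter. As a possible alternative, here is a sketch that builds one sed program from the parameter names and filters the whole stream with a single sed process. The function names build_sed_program and filter_params, the sample request, and the [^&[:space:]] character class are assumptions for illustration, not part of the script above.

```shell
#!/bin/bash
# Sketch: turn the parameter names into one sed program, e.g.
#   s/token=[^&[:space:]]*/token=[FILTERED]/g;s/user=.../.../g;
build_sed_program() {
    local name escaped program=''
    for name in "$@"; do
        # Escape anything other than letters, digits and "_" so the name matches literally.
        escaped=$(printf '%s' "$name" | sed -E 's|([^0-9a-zA-Z_])|\\\1|g')
        program+="s/${escaped}=[^&[:space:]]*/${escaped}=[FILTERED]/g;"
    done
    printf '%s' "$program"
}

# Filters stdin, masking the value of every parameter named in the arguments.
filter_params() {
    sed -E "$(build_sed_program "$@")"
}

# Illustrative sample request, not taken from a real log.
printf '%s\n' 'GET /api.php?token=abc&user=bob&id=7 HTTP/1.1' | filter_params token user
# prints: GET /api.php?token=[FILTERED]&user=[FILTERED]&id=7 HTTP/1.1
```

When sed runs as a long-lived pipe it may buffer its output, so entries can reach the log file late; GNU sed has a -u (unbuffered) option that is worth considering in that setup.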
Documentation
http://httpd.apache.org/docs/2.4/logs.html#piped
http://www.gnu.org/software/sed/manual/sed.html