We are using Nagios to monitor our network with great success. However, we also have a syslog file for critical application errors, and although I set up check_log, it doesn't seem to work as well as monitoring a device.
The issues are:
Is Nagios the wrong tool, or are we just not setting up the service monitoring correctly?
Here are my entries:

# log file
define command{
    command_name    check_log
    command_line    $USER1$/check_log -F /var/log/applications/appcrit.log -O /tmp/appcrit.log -q ?
}

# Define the log monitoring service
define service{
    name                    logfile-check
    use                     generic-service
    check_period            24x7
    max_check_attempts      1
    normal_check_interval   5
    retry_check_interval    1
    contact_groups          admins
    notification_options    w,u,c,r
    notification_period     24x7
    register                0
}

define service{
    use                     logfile-check
    host_name               localhost
    service_description     CritLogFile
    check_command           check_log
}
Nagios can monitor system logs, application logs, event logs, service logs, and syslog data on Windows, Linux, and Unix servers, and alert you when a log pattern is detected.
Nagios Log Server is marketed as a powerful and trusted IT log analysis tool.
The log file is usually located at /usr/local/nagios/var/nagios.log or /var/log/nagios3/nagios.log.
For monitoring logs with Nagios, typically the log checker will return a warning only for newly discovered error messages each time it is invoked (so it must retain some state in order to know to ignore them on subsequent runs). Therefore I usually set:
max_check_attempts 1
is_volatile 1
This causes Nagios to send out the alert immediately, but only once, and then go back to normal.
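To illustrate the state-keeping mentioned above, here is a minimal Python sketch of a log check that reports only lines added since its previous run. It is not how check_log or logwarn is implemented; the function name check_new_lines, the state-file layout (a single byte offset), and the sample pattern are all assumptions made for illustration. The exit codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL).

```python
import os
import re

# Nagios plugin exit codes (standard plugin convention).
OK, CRITICAL = 0, 2


def check_new_lines(log_path, state_path, pattern):
    """Return (exit_code, matches) for lines appended since the last run.

    Hypothetical helper: the state file simply stores the byte offset
    reached by the previous run.
    """
    # Read the offset saved by the previous run; start at 0 if there is none.
    try:
        with open(state_path) as f:
            offset = int(f.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        offset = 0

    # If the log is now shorter than the saved offset (for example after
    # rotation), start again from the beginning.
    if offset > os.path.getsize(log_path):
        offset = 0

    # Read only the bytes that were appended after the saved offset.
    with open(log_path, "rb") as f:
        f.seek(offset)
        new_data = f.read()
        new_offset = f.tell()

    # Persist the new offset so the next run ignores what was just read.
    with open(state_path, "w") as f:
        f.write(str(new_offset))

    # Simplification: a partially written last line is treated as complete.
    regex = re.compile(pattern)
    lines = new_data.decode(errors="replace").splitlines()
    matches = [line for line in lines if regex.search(line)]
    return (CRITICAL if matches else OK), matches
```

Because a second run over an unchanged log returns OK, the service would recover on its own after a single check, which is why the answer pairs this kind of checker with max_check_attempts 1 and is_volatile 1 so the one-off alert is not lost.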
My favorite log checker is logwarn, but I'm biased because I wrote it myself after not finding any existing ones that I liked. The logwarn package includes a Nagios plugin.