How do I use Nagios to monitor a log file?

We are using Nagios to monitor our network with great success. However, we have a syslog for critical application errors, and while I set up check_log, it doesn't seem to work as well as monitoring a device.

The issues are:

It only shows the last entry
There doesn't seem to be a way to acknowledge the critical error and return the monitor to a good state

Is Nagios the wrong tool, or are we just not setting up the service monitoring right?

Here are my entries:

    # log file
    define command{
            command_name    check_log
            command_line    $USER1$/check_log -F /var/log/applications/appcrit.log -O /tmp/appcrit.log -q ?
            }

    # Define the log monitoring service
    define service{
            name                            logfile-check           ;
            use                             generic-service         ;
            check_period                    24x7                    ;
            max_check_attempts              1                       ;
            normal_check_interval           5                       ;
            retry_check_interval            1                       ;
            contact_groups                  admins                  ;
            notification_options            w,u,c,r                 ;
            notification_period             24x7                    ;
            register                        0                       ;
            }

    define service{
            use                             logfile-check
            host_name                       localhost
            service_description             CritLogFile
            check_command                   check_log
            }

asked Mar 03 '10 16:03

Kenoyer130

1 Answer

For monitoring logs with Nagios, typically the log checker will return a warning only for newly discovered error messages each time it is invoked (so it must retain some state in order to know to ignore them on subsequent runs). Therefore I usually set:

    max_check_attempts              1
    is_volatile                     1

This causes Nagios to send out the alert immediately, but only once, and then go back to normal.
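
Applied to the service template from the question, those two settings might look like the sketch below. Only `is_volatile 1` is new; every other directive is copied from the question's `logfile-check` template, with the empty trailing `;` comments dropped. It assumes the check plugin reports only entries added since its previous run, so treat it as a starting point rather than a verified configuration.

```
# Sketch: the question's log-check template with volatility enabled
define service{
        name                            logfile-check
        use                             generic-service
        check_period                    24x7
        max_check_attempts              1       ; first non-OK result is already a hard state
        is_volatile                     1       ; each non-OK result is treated as a new problem
        normal_check_interval           5
        retry_check_interval            1
        contact_groups                  admins
        notification_options            w,u,c,r
        notification_period             24x7
        register                        0       ; template only, not a real service
        }
```

With this in place, no manual acknowledgement should be needed: if the next check finds no new matching entries, the plugin returns OK and the service recovers on its own.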

My favorite log checker is logwarn, but I'm biased because I wrote it myself after not finding any existing ones that I liked. The logwarn package includes a Nagios plugin.

answered Nov 01 '22 11:11

Archie
