Suppress multiple messages with same content in Python logging module AKA log compression

By design, my application sometimes produces repeating errors which fill up the log file and make it annoying to read. It looks like this:

WARNING:__main__:CRON10: clock unset or no wind update received in 60 sec -> supressed rrd update
WARNING:__main__:CRON10: clock unset or no wind update received in 60 sec -> supressed rrd update
WARNING:__main__:CRON10: clock unset or no wind update received in 60 sec -> supressed rrd update
WARNING:__main__:CRON10: clock unset or no wind update received in 60 sec -> supressed rrd update

How can I use the Python logging module to suppress repeating messages and output something more rsyslog-style, like its repeated-message reduction (http://www.rsyslog.com/doc/rsconf1_repeatedmsgreduction.html):

WARNING:__main__:CRON10: clock unset or no wind update received in 60 sec -> supressed rrd update
--- The last message repeated 3 times

Is there a way to extend logging, or do I have to write a completely custom logger?

The code I use for logging is:

import logging

logging.basicConfig(format='%(asctime)s %(message)s', level=logging.INFO)

logger = logging.getLogger(__name__)
formatter = logging.Formatter('%(asctime)s %(message)s')
hdlr = logging.FileHandler(LOGFILE)
hdlr.setFormatter(formatter)
logger.addHandler(hdlr)

Any ideas on that?

asked Jun 22 '17 by Andy Schi

People also ask

How do I create a multiple logging level in Python?

You can set a different logging level for each handler, but the logger itself must be set to the lowest level you want any handler to receive. In the example below the logger is set to DEBUG, the stream handler to INFO and the TimedRotatingFileHandler to DEBUG, so the file gets DEBUG entries while the stream outputs only INFO.
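A minimal sketch of that setup (the logger name "example" and the file name "app.log" are placeholders):

import logging
from logging.handlers import TimedRotatingFileHandler

# The logger must allow the lowest level any of its handlers should receive.
logger = logging.getLogger("example")
logger.setLevel(logging.DEBUG)

stream = logging.StreamHandler()
stream.setLevel(logging.INFO)                  # console: INFO and above

file_hdlr = TimedRotatingFileHandler("app.log", when="midnight")
file_hdlr.setLevel(logging.DEBUG)              # file: DEBUG and above

logger.addHandler(stream)
logger.addHandler(file_hdlr)

logger.debug("written to the file only")
logger.info("written to both the file and the console")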

What are the five levels of logging in Python?

Log messages can have 5 levels - DEBUG, INFO, WARNING, ERROR and CRITICAL. They can also include traceback information for exceptions.
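A quick illustration of the levels and of attaching a traceback (just a sketch; the logger configuration is arbitrary):

import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger(__name__)

log.debug("debug message")
log.info("info message")
log.warning("warning message")
log.error("error message")
log.critical("critical message")

try:
    1 / 0
except ZeroDivisionError:
    # exc_info=True appends the current traceback; log.exception(...) does the same at ERROR level
    log.error("division failed", exc_info=True)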

Is logging blocking in Python?

Logging handlers do their work synchronously, so a logging call blocks the thread it is made from. Sometimes you need handlers to do their work without blocking that thread; this is common in web applications, though it also occurs in other scenarios. For details see the Python docs on logging.
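One way around that, described in the Python logging cookbook, is to hand records to a queue and let a background listener do the slow I/O. A minimal sketch, with "slow.log" as a placeholder file name:

import logging
import queue
from logging.handlers import QueueHandler, QueueListener

log_queue = queue.Queue(-1)                      # unbounded queue
file_hdlr = logging.FileHandler("slow.log")      # the "slow" handler
listener = QueueListener(log_queue, file_hdlr)
listener.start()                                 # background thread does the I/O

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
logger.addHandler(QueueHandler(log_queue))       # logging calls just enqueue the record

logger.info("this call returns almost immediately")
listener.stop()                                  # flush the queue on shutdown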


1 Answer

You can create a logging.Filter that will keep track of the last logged record and filter out any repeated (similar) records, something like:

import logging

class DuplicateFilter(logging.Filter):

    def filter(self, record):
        # add other fields if you need more granular comparison, depends on your app
        current_log = (record.module, record.levelno, record.msg)
        if current_log != getattr(self, "last_log", None):
            self.last_log = current_log
            return True
        return False

Then just add it to the logger/handler you use (e.g. hdlr.addFilter(DuplicateFilter())) or to the root logger to filter all default logs. Here's a simple test:

import logging

logging.warning("my test")
logging.warning("my repeated test")
logging.warning("my repeated test")
logging.warning("my repeated test")
logging.warning("my other test")

logger = logging.getLogger()  # get the root logger
logger.addFilter(DuplicateFilter())  # add the filter to it

logging.warning("my test")
logging.warning("my repeated test")
logging.warning("my repeated test")
logging.warning("my repeated test")
logging.warning("my other test")

This will print out:

WARNING:root:my test
WARNING:root:my repeated test
WARNING:root:my repeated test
WARNING:root:my repeated test
WARNING:root:my other test
WARNING:root:my test
WARNING:root:my repeated test
WARNING:root:my other test
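To get the rsyslog-style repeat count the question asks for, the same filter idea can be extended. This is a rough sketch on top of the answer above, not part of the original answer: it counts suppressed duplicates and prepends a summary line to the next different record, so repeats still pending at shutdown are never reported, and the summary inherits the next record's formatter prefix.

import logging

class RepeatCountFilter(logging.Filter):
    """Suppress consecutive duplicates and report how many were dropped."""

    def __init__(self, name=""):
        super().__init__(name)
        self.last_log = None
        self.repeats = 0

    def filter(self, record):
        current_log = (record.module, record.levelno, record.getMessage())
        if current_log == self.last_log:
            self.repeats += 1
            return False                                   # drop the duplicate
        if self.repeats:
            # piggyback the summary on the next, different record
            record.msg = ("--- The last message repeated %d times\n%s"
                          % (self.repeats, record.getMessage()))
            record.args = None
        self.last_log = current_log
        self.repeats = 0
        return True

Attach it the same way as above, e.g. logger.addFilter(RepeatCountFilter()) on the root logger or on a specific handler.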
answered Sep 17 '22 by zwer