
How to limit the number of identical log messages?

I use the logging module to warn about issues with some routines. These routines may be run several times before getting back to normal operations (e.g. repeated queries to an API which fail, but eventually go through). Each failed call triggers a log entry.

Is there a way to limit the number of identical log messages?
Ideally, the limit would kick in after n identical messages have been output, the log would then state how many more were generated (so as not to clutter the log file), and the limit would reset once a recovery message is logged. That is the ideal scenario; for now I am looking for a way to approach the problem.

The closest I found was the conditional release of logs but I do not see how this could be adapted to my case.
Another possibility would be to apply the limit at the syslog level (in rsyslog or syslog-ng), but that is a "per process" setting, so I could lose useful logs (the ones that arrive in between the loop-generated ones).

asked Nov 12 '15 15:11 by WoJ


1 Answer

Use a logging.Filter!

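The DuplicateFilter below takes two patterns, each either a compiled regex or a string that gets compiled into one: match_against identifies the messages you want to suppress, and reset_at_message identifies the message that resets the counter. Both are tested with match(), so they have to match from the start of the log message.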

import logging
import os
import re
import sys

from re import Pattern  # the compiled-regex type; sre_parse is a private module


class DuplicateFilter(logging.Filter):
    def __init__(self, match_against, reset_at_message, hide_at_count=5, name=''):
        super(DuplicateFilter, self).__init__(name)

        if isinstance(match_against, Pattern):
            self.match_against = match_against
        else:
            self.match_against = re.compile(match_against)

        if isinstance(reset_at_message, Pattern):
            self.reset_at_message = reset_at_message
        else:
            self.reset_at_message = re.compile(reset_at_message)

        self.hide_at_count = hide_at_count

        self.count = 0

    def filter(self, record: logging.LogRecord):
        # Respect the name-based filtering of the base class first.
        allowed = super(DuplicateFilter, self).filter(record)
        if not allowed:
            return allowed

        msg = record.getMessage()

        if self.match_against.match(msg):
            self.count += 1

            # Hide the message from the hide_at_count-th occurrence onwards.
            if self.count >= self.hide_at_count:
                return False

        elif self.reset_at_message.match(msg):
            # Occurrences hide_at_count..count were hidden; only report
            # them if there were any, so the number is never negative.
            hidden = self.count - self.hide_at_count + 1
            if hidden > 0:
                record.msg = os.linesep.join([
                    '{:d} more generated'.format(hidden),
                    record.msg
                ])
            self.count = 0

        return True

handler = logging.StreamHandler(sys.stdout)
handler.addFilter(DuplicateFilter('Filter me!', 'Reset at me'))

logging.basicConfig(level='INFO', handlers=[handler, ])

log = logging.getLogger()

for _ in range(10):
    log.info('Filter me!')

log.info('Reset at me')

for _ in range(3):
    log.info('Filter me!')

This is the resulting log (with the default hide_at_count=5, four messages get through and the remaining six are hidden):

INFO:root:Filter me!
INFO:root:Filter me!
INFO:root:Filter me!
INFO:root:Filter me!
INFO:root:6 more generated
Reset at me
INFO:root:Filter me!
INFO:root:Filter me!
INFO:root:Filter me!

Simply prepending the "n more generated" line to the reset message probably isn't what you want, but hopefully this is a good starting point.
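If you would rather have the summary as a log entry of its own, one possible variation is to wrap the real handler instead of filtering on it. The sketch below is only an illustration of that idea, not part of the filter above: RepeatLimiter, target and limit are made-up names, it compares messages for exact equality instead of using regexes, and it emits the summary as soon as any different message arrives (much like syslog's "last message repeated n times"), which I am assuming is acceptable as the "recovery" signal.

```python
import logging
import sys


class RepeatLimiter(logging.Handler):
    """Hypothetical wrapper: forwards records to `target`, but lets at most
    `limit` consecutive identical messages through."""

    def __init__(self, target, limit=5):
        super().__init__()
        self.target = target      # the handler that really writes the log
        self.limit = limit        # identical messages allowed in a row
        self.last_key = None      # (logger name, level, text) of the current run
        self.repeats = 0          # length of the current run

    def emit(self, record):
        key = (record.name, record.levelno, record.getMessage())
        if key == self.last_key:
            self.repeats += 1
            if self.repeats > self.limit:
                return            # suppressed, only counted
        else:
            self._flush_summary(record)
            self.last_key = key
            self.repeats = 1
        self.target.handle(record)

    def _flush_summary(self, record):
        # Emit a separate record describing how many messages were dropped.
        suppressed = self.repeats - self.limit
        if suppressed > 0:
            name, level, _ = self.last_key
            summary = logging.LogRecord(
                name, level, record.pathname, record.lineno,
                "previous message repeated %d more time(s)",
                (suppressed,), None)
            self.target.handle(summary)


if __name__ == "__main__":
    stream = logging.StreamHandler(sys.stdout)
    stream.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    log = logging.getLogger("demo")
    log.setLevel(logging.INFO)
    log.propagate = False
    log.addHandler(RepeatLimiter(stream, limit=3))

    for _ in range(10):
        log.warning("API call failed")
    log.info("API call succeeded")
```

Because the state lives in emit(), it is protected by the lock that Handler.handle() takes on the wrapper. The catch is that a pending summary is only written when the next different message shows up, so if the process exits in the middle of a run the count is lost unless you also flush it in close().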

answered Oct 15 '22 08:10 by asoc