I'm writing a program which backs up a database using Python's RotatingFileHandler. This has two parameters, maxBytes and backupCount: the former is the maximum size of each log file, and the latter the maximum number of log files. I would like to effectively never delete data, but still keep each log file under a certain size (say, 2 kB for the purpose of illustration). So I tried setting the backupCount parameter to sys.maxint:
import msgpack
import json
from faker import Faker
import logging
from logging.handlers import RotatingFileHandler
import os, glob
import itertools
import sys

fake = Faker()
fake.seed(0)

data_file = "my_log.log"

logger = logging.getLogger('my_logger')
logger.setLevel(logging.DEBUG)
handler = RotatingFileHandler(data_file, maxBytes=2000, backupCount=sys.maxint)
logger.addHandler(handler)

fake_dicts = [{'name': fake.name(), 'email': fake.email()} for _ in range(100)]

def dump(item, mode='json'):
    if mode == 'json':
        return json.dumps(item)
    elif mode == 'msgpack':
        return msgpack.packb(item)

mode = 'json'

# Generate the archive log
for item in fake_dicts:
    dump_string = dump(item, mode=mode)
    logger.debug(dump_string)
However, this leads to several MemoryErrors, which look like this:
Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/handlers.py", line 77, in emit
    self.doRollover()
  File "/usr/lib/python2.7/logging/handlers.py", line 129, in doRollover
    for i in range(self.backupCount - 1, 0, -1):
MemoryError
Logged from file json_logger.py, line 37
It seems like making this parameter large causes the system to use lots of memory, which is not desirable. Is there any way around this trade-off?
An improvement to the solution suggested by @Asiel:

Instead of using itertools and os.path.exists to determine what the nextName should be in doRollover, the solution below simply remembers the number of the last backup done and increments it to get the nextName.
from logging.handlers import RotatingFileHandler
import os

class RollingFileHandler(RotatingFileHandler):

    def __init__(self, filename, mode='a', maxBytes=0, backupCount=0, encoding=None, delay=False):
        self.last_backup_cnt = 0
        super(RollingFileHandler, self).__init__(filename=filename,
                                                 mode=mode,
                                                 maxBytes=maxBytes,
                                                 backupCount=backupCount,
                                                 encoding=encoding,
                                                 delay=delay)

    # override
    def doRollover(self):
        if self.stream:
            self.stream.close()
            self.stream = None
        # my code starts here
        self.last_backup_cnt += 1
        nextName = "%s.%d" % (self.baseFilename, self.last_backup_cnt)
        self.rotate(self.baseFilename, nextName)
        # my code ends here
        if not self.delay:
            self.stream = self._open()
This class will still save your backups in ascending order (e.g. the first backup will end with ".1", the second one with ".2", and so on). Modifying it to also gzip the rotated files is straightforward.
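For instance, a gzip version of this counter-based handler could look roughly like the sketch below (the class name RollingGzipCounterFileHandler is made up for illustration; it simply combines the counter idea above with the compression approach of the gzip handler shown further down):

from logging.handlers import RotatingFileHandler
import gzip
import os
import shutil

class RollingGzipCounterFileHandler(RotatingFileHandler):

    def __init__(self, filename, mode='a', maxBytes=0, backupCount=0, encoding=None, delay=False):
        self.last_backup_cnt = 0
        super(RollingGzipCounterFileHandler, self).__init__(filename=filename,
                                                            mode=mode,
                                                            maxBytes=maxBytes,
                                                            backupCount=backupCount,
                                                            encoding=encoding,
                                                            delay=delay)

    # override
    def doRollover(self):
        if self.stream:
            self.stream.close()
            self.stream = None
        # compress the current log into the next ".n.gz" backup instead of renaming it
        self.last_backup_cnt += 1
        nextName = "%s.%d.gz" % (self.baseFilename, self.last_backup_cnt)
        with open(self.baseFilename, 'rb') as original_log:
            with gzip.open(nextName, 'wb') as gzipped_log:
                shutil.copyfileobj(original_log, gzipped_log)
        os.remove(self.baseFilename)
        if not self.delay:
            self.stream = self._open()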
The problem here is that RotatingFileHandler is intended to... well, rotate. If you set its backupCount to a big number, the RotatingFileHandler.doRollover method will loop over a backward range from backupCount - 1 down to zero trying to find the last created backup, so the bigger the backupCount, the slower the rollover (when you only have a small number of backups). On Python 2 it is even worse: range() builds a real list, so a backupCount of sys.maxint makes doRollover try to allocate a gigantic list up front, which is exactly the MemoryError in your traceback.

Also, RotatingFileHandler will keep renaming your backups, which isn't necessary for what you want and is pure overhead: instead of simply writing your latest backup with the next ".n+1" extension, it renames all your backups and writes the latest one with the extension ".1" (shifting every backup name).
You could code the following class (probably with a better name):
from logging.handlers import RotatingFileHandler
import itertools
import os

class RollingFileHandler(RotatingFileHandler):

    # override
    def doRollover(self):
        if self.stream:
            self.stream.close()
            self.stream = None
        # my code starts here
        for i in itertools.count(1):
            nextName = "%s.%d" % (self.baseFilename, i)
            if not os.path.exists(nextName):
                self.rotate(self.baseFilename, nextName)
                break
        # my code ends here
        if not self.delay:
            self.stream = self._open()
This class will save your backups in ascending order (e.g. the first backup will end with ".1", the second one with ".2", and so on).
Since RollingFileHandler extends RotatingFileHandler, you can simply replace RotatingFileHandler with RollingFileHandler in your code; you don't need to provide the backupCount argument, since it is ignored by this new class (see the usage sketch below).
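To make that concrete, here is roughly how the script from the question would change; only the handler line really differs, and a hard-coded sample record stands in for the Faker output:

import logging

data_file = "my_log.log"

logger = logging.getLogger('my_logger')
logger.setLevel(logging.DEBUG)
# RollingFileHandler as defined above; backupCount is simply omitted
# because the overridden doRollover() never looks at it.
handler = RollingFileHandler(data_file, maxBytes=2000)
logger.addHandler(handler)

logger.debug('{"name": "Jane Doe", "email": "jane@example.com"}')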
Since you will have an ever-growing number of log backups, you may want to compress them to save disk space. So you could create a class similar to RollingFileHandler:
from logging.handlers import RotatingFileHandler
import gzip
import itertools
import os
import shutil

class RollingGzipFileHandler(RotatingFileHandler):

    # override
    def doRollover(self):
        if self.stream:
            self.stream.close()
            self.stream = None
        # my code starts here
        for i in itertools.count(1):
            nextName = "%s.%d.gz" % (self.baseFilename, i)
            if not os.path.exists(nextName):
                with open(self.baseFilename, 'rb') as original_log:
                    with gzip.open(nextName, 'wb') as gzipped_log:
                        shutil.copyfileobj(original_log, gzipped_log)
                os.remove(self.baseFilename)
                break
        # my code ends here
        if not self.delay:
            self.stream = self._open()
This class will save your compressed backups with extensions ".1.gz", ".2.gz", and so on. There are also other compression algorithms available in the standard library if you don't want to use gzip; swapping one in mostly means changing how the backup file is opened, as in the bz2 sketch below.
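For example, a bz2-based variant could look roughly like this (the class name RollingBz2FileHandler and the ".bz2" suffix are just illustrative choices; bz2.BZ2File is used rather than bz2.open because it exists on both Python 2.7 and Python 3):

from logging.handlers import RotatingFileHandler
import bz2
import itertools
import os
import shutil

class RollingBz2FileHandler(RotatingFileHandler):

    # override
    def doRollover(self):
        if self.stream:
            self.stream.close()
            self.stream = None
        # compress the current log into the first unused ".n.bz2" backup
        for i in itertools.count(1):
            nextName = "%s.%d.bz2" % (self.baseFilename, i)
            if not os.path.exists(nextName):
                with open(self.baseFilename, 'rb') as original_log:
                    with bz2.BZ2File(nextName, 'wb') as compressed_log:
                        shutil.copyfileobj(original_log, compressed_log)
                os.remove(self.baseFilename)
                break
        if not self.delay:
            self.stream = self._open()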
This is an old question, but I hope this helps.