Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

why TimedRotatingFileHandler does not delete old files?

I am using TimedRotatingFileHandler to create my logs. I want my log files to be created every minute, keep at most 2 log files and delete older ones. Here is the sample code:

import logging
import logging.handlers
import datetime

logger = logging.getLogger('MyLogger')
logger.setLevel(logging.DEBUG)

handler = logging.handlers.TimedRotatingFileHandler(
    "logs/{:%H-%M}.log".format(datetime.datetime.now()), 
    when="M", 
    backupCount=2)

logger.addHandler(handler)
logger.debug("PLEASE DELETE PREVIOUS FILES")

If I run this code multiple times (with a minute interval) I get multiple files in my logs directory like so:

21-01.log
21-02.log
21-03.log
...

This seems strange to me, since I set backupCount=2 which indicates that at most 2 files should be saved and older files should be deleted. However when I start my application with 2 files or more in the log folder, old files are not deleted.

Why TimedRotatingFileHandler does not delete old files? Is there any way I can set TimedRotatingFileHandler to delete older files?

like image 214
Mykhailo Seniutovych Avatar asked Mar 14 '19 19:03

Mykhailo Seniutovych


People also ask

What is a rotating file handler?

Defines a handler which writes to a file, rotating the log after a time period derived from the given suffix string or after the size of the file grows beyond a certain point and keeping a fixed number of backups.

What is the correct file handler syntax if mode is not specified?

If mode is not specified, 'a' is used. If encoding is not None , it is used to open the file with that encoding.

What is a logger handler?

Log handlers write log record objects to output devices like log files, sockets, and notification mechanisms. Loggers can have zero or more attached handlers. All objects that are logged to the logger are passed to the attached handlers, if handlers are supplied. You can configure handlers with a level.


3 Answers

Like you can see in TimedRotatingFileHandler documentation, your log filename should be the same to get properly the rotating system.

In your case, because YOU are yourself appending dateTime information, the log filename is different each time, and so you observe the result.

So, in your source code, you just need to adapt the log filename:

handler = logging.handlers.TimedRotatingFileHandler(
    "logs/MyLog", 
    when="M", 
    backupCount=2)

If you want to challenge it, you can change the when to "S" (seconds), and check the rotation is OK.

For instance, it will produce such files automagically:

> MyLog
> MyLog.2019-07-08_11-36-53
> MyLog.2019-07-08_11-36-58

Don't hesitate if you need additional information.

like image 152
Bsquare ℬℬ Avatar answered Oct 12 '22 09:10

Bsquare ℬℬ


You can't use TimedRotatingFileHandler, as designed, for your use case. The handler expects the "current" log file name to remain stable and defines rotating as moving existing logging files to a backup by renaming. These are the backups that are kept or deleted. The rotation backups are created from the base filename plus a suffix with the rotation timestamp. So the implementation distinguishes between the log file (stored in baseFilename) and rotation files (generated in the doRotate() method. Note that backups are only deleted when rotation takes place, so after the handler has been in use for at least one full interval.

You instead want the base filename itself to carry the time information, and so are varying the log file name itself. There are no 'backups' in this scenario, you simply open a new file at rotation moments. Moreover, you appear to be running short-lived Python code, so you want older files removed immediately, not just when explicitly rotating, which may never be reached.

This is why TimedRotatingFileHandler won't delete any files, because *it never gets to create backup files. No backups means there is no backups to remove. To rotate files, current implementation of the handler expects to be in charge of filename generation, and can't be expected to know about filenames it would not itself generate. When you configure it with the "M" per-minute rotation frequency, it is configured to rotate files to backup files with the pattern {baseFileame}.{now:%Y-%m-%d_%H_%M}, and so will only ever delete rotated backup files that match that pattern. See the documentation:

The system will save old log files by appending extensions to the filename. The extensions are date-and-time based, using the strftime format %Y-%m-%d_%H-%M-%S or a leading portion thereof, depending on the rollover interval.

Instead, what you want is a base filename that itself carries the timestamp, and that on opening a new log file with a different name that older log files (not backup files) are deleted. For this you'd have to create a custom handler.

Luckily, the class hierarchy is specifically designed for easy customisation. You can subclass BaseRotatingHandler here, and provide your own deletion logic:

import os
import time
from itertools import islice
from logging.handlers import BaseRotatingHandler, TimedRotatingFileHandler

# rotation intervals in seconds
_intervals = {
    "S": 1,
    "M": 60,
    "H": 60 * 60,
    "D": 60 * 60 * 24,
    "MIDNIGHT": 60 * 60 * 24,
    "W": 60 * 60 * 24 * 7,
}

class TimedPatternFileHandler(BaseRotatingHandler):
    """File handler that uses the current time in the log filename.

    The time is quantisized to a configured interval. See
    TimedRotatingFileHandler for the meaning of the when, interval, utc and
    atTime arguments.

    If backupCount is non-zero, then older filenames that match the base
    filename are deleted to only leave the backupCount most recent copies,
    whenever opening a new log file with a different name.

    """

    def __init__(
        self,
        filenamePattern,
        when="h",
        interval=1,
        backupCount=0,
        encoding=None,
        delay=False,
        utc=False,
        atTime=None,
    ):
        self.when = when.upper()
        self.backupCount = backupCount
        self.utc = utc
        self.atTime = atTime
        try:
            key = "W" if self.when.startswith("W") else self.when
            self.interval = _intervals[key]
        except KeyError:
            raise ValueError(
                f"Invalid rollover interval specified: {self.when}"
            ) from None
        if self.when.startswith("W"):
            if len(self.when) != 2:
                raise ValueError(
                    "You must specify a day for weekly rollover from 0 to 6 "
                    f"(0 is Monday): {self.when}"
                )
            if not "0" <= self.when[1] <= "6":
                raise ValueError(
                    f"Invalid day specified for weekly rollover: {self.when}"
                )
            self.dayOfWeek = int(self.when[1])

        self.interval = self.interval * interval
        self.pattern = os.path.abspath(os.fspath(filenamePattern))

        # determine best time to base our rollover times on
        # prefer the creation time of the most recently created log file.
        t = now = time.time()
        entry = next(self._matching_files(), None)
        if entry is not None:
            t = entry.stat().st_ctime
            while t + self.interval < now:
                t += self.interval

        self.rolloverAt = self.computeRollover(t)

        # delete older files on startup and not delaying
        if not delay and backupCount > 0:
            keep = backupCount
            if os.path.exists(self.baseFilename):
                keep += 1
                delete = islice(self._matching_files(), keep, None)
                for entry in delete:
                    os.remove(entry.path)

        # Will set self.baseFilename indirectly, and then may use
        # self.baseFilename to open. So by this point self.rolloverAt and
        # self.interval must be known.
        super().__init__(filenamePattern, "a", encoding, delay)

    @property
    def baseFilename(self):
        """Generate the 'current' filename to open"""
        # use the start of *this* interval, not the next
        t = self.rolloverAt - self.interval
        if self.utc:
            time_tuple = time.gmtime(t)
        else:
            time_tuple = time.localtime(t)
            dst = time.localtime(self.rolloverAt)[-1]
            if dst != time_tuple[-1] and self.interval > 3600:
                # DST switches between t and self.rolloverAt, adjust
                addend = 3600 if dst else -3600
                time_tuple = time.localtime(t + addend)
        return time.strftime(self.pattern, time_tuple)

    @baseFilename.setter
    def baseFilename(self, _):
        # assigned to by FileHandler, just ignore this as we use self.pattern
        # instead
        pass

    def _matching_files(self):
        """Generate DirEntry entries that match the filename pattern.

        The files are ordered by their last modification time, most recent
        files first.

        """
        matches = []
        pattern = self.pattern
        for entry in os.scandir(os.path.dirname(pattern)):
            if not entry.is_file():
                continue
            try:
                time.strptime(entry.path, pattern)
                matches.append(entry)
            except ValueError:
                continue
        matches.sort(key=lambda e: e.stat().st_mtime, reverse=True)
        return iter(matches)

    def doRollover(self):
        """Do a roll-over. This basically needs to open a new generated filename.
        """
        if self.stream:
            self.stream.close()
            self.stream = None

        if self.backupCount > 0:
            delete = islice(self._matching_files(), self.backupCount, None)
            for entry in delete:
                os.remove(entry.path)

        now = int(time.time())
        rollover = self.computeRollover(now)
        while rollover <= now:
            rollover += self.interval
        if not self.utc:
            # If DST changes and midnight or weekly rollover, adjust for this.
            if self.when == "MIDNIGHT" or self.when.startswith("W"):
                dst = time.localtime(now)[-1]
                if dst != time.localtime(rollover)[-1]:
                    rollover += 3600 if dst else -3600
        self.rolloverAt = rollover

        if not self.delay:
            self.stream = self._open()

    # borrow *some* TimedRotatingFileHandler methods
    computeRollover = TimedRotatingFileHandler.computeRollover
    shouldRollover = TimedRotatingFileHandler.shouldRollover

Use this with time.strftime() placeholders in the log filename, and those will be filled in for you:

handler = TimedPatternFileHandler("logs/%H-%M.log", when="M", backupCount=2)

Note that this cleans up older files when you create the instance.

like image 4
Martijn Pieters Avatar answered Oct 12 '22 11:10

Martijn Pieters


I solved the issue, the problem in my case was exactly in the way I named the log file. Instead of using "example.log" I've tried to use "example" without the extension, like that:

logHandler = handlers.TimedRotatingFileHandler('example', when='M', interval=1, backupCount=2)

In the handlers.py file of the logging lib, specifically in getFilesToDelete() method I've noticed that tiny comment which says exactly what not to do:

# See bpo-44753: Don't use the extension when computing the prefix.

Though the issue seems to be already solved, this is the answer to help the same people as me, searching for it for their case.

like image 1
martyrobbins Avatar answered Oct 12 '22 11:10

martyrobbins