
tail multiple logfiles in python

Tags:

python

This is probably a bit of a silly exercise, but it raises a bunch of interesting questions. I have a directory of logfiles from my chat client, and I want to be notified via notify-osd every time one of them changes.

The script that I wrote basically uses os.popen to run the Linux tail command on each of the files to get its last line, and then checks each line against a dictionary of what the lines were on the previous pass. If a line has changed, it uses pynotify to send me a notification.

This script actually worked perfectly, except that it used a huge amount of CPU (probably because it ran tail about 16 times on every pass through the loop, on files that were mounted over sshfs).

It seems like something like this would be a great solution, but I don't see how to implement that for more than one file.

Here is the script that I wrote. Pardon my lack of comments and poor style.

Edit: To clarify, this is all on a Linux desktop.

keevie asked Dec 13 '22 13:12


2 Answers

Not even looking at your source code, there are two ways you could easily do this more efficiently and handle multiple files.

  1. Don't bother running tail unless you have to. Simply os.stat all of the files and record the last modified time. If the last modified time is different, then raise a notification.

  2. Use pyinotify to call out to Linux's inotify facility; this will have the kernel do option 1 for you and call back to you when any files in your directory change. Then translate the callback into your osd notification.
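
A minimal sketch of option 1, assuming the list of logfile paths is already known. The function name `changed_files` and the `last_mtimes` dictionary are made up for illustration and do not come from the original script:

```python
import os

def changed_files(paths, last_mtimes):
    """Return the paths whose modification time changed since the last call.

    last_mtimes maps path -> mtime and is updated in place. A file seen for
    the first time is recorded but not reported as changed.
    """
    changed = []
    for path in paths:
        try:
            mtime = os.stat(path).st_mtime
        except OSError:
            continue  # file vanished or is unreadable; skip it this pass
        if path in last_mtimes and last_mtimes[path] != mtime:
            changed.append(path)
        last_mtimes[path] = mtime
    return changed
```

Called in a loop with a `time.sleep()` between passes, this costs one stat call per file instead of one tail process per file; tail (or a plain read) is then only needed for the files that are reported as changed.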

Now, there might be some trickiness depending on how many notifications you want when there are multiple messages and whether you care about missing a notification for a message.

An approach that preserves the use of tail would be to use tail -f instead. Open all of the files with tail -f, then use the select module to have the OS tell you when there's additional input on one of the file descriptors open for tail -f. Your main loop would call select and then iterate over each of the readable descriptors to generate notifications. (Doing this without tail, by calling readline() on the files directly, would need polling or inotify rather than select, because select always reports regular files as readable.)
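
A rough sketch of the tail -f plus select idea, assuming a Unix system with `tail` on the PATH. The generator name `follow` and the 4096-byte read size are illustrative choices, not part of the original script. It reads from the pipes with `os.read` and splits lines itself, because mixing select with buffered `readline()` can leave lines sitting unseen in Python's buffer:

```python
import os
import select
import subprocess

def follow(paths):
    """Yield (path, line) for every complete line appended to any file."""
    procs = []
    watched = {}  # pipe file descriptor -> [path, partial-line buffer]
    for path in paths:
        # -n 0 makes tail skip the existing contents and report only new lines
        proc = subprocess.Popen(['tail', '-f', '-n', '0', path],
                                stdout=subprocess.PIPE)
        procs.append(proc)
        watched[proc.stdout.fileno()] = [path, b'']
    try:
        while watched:
            # block until at least one tail process has written something
            readable, _, _ = select.select(list(watched), [], [])
            for fd in readable:
                entry = watched[fd]
                chunk = os.read(fd, 4096)
                if not chunk:
                    # EOF: this tail process exited, so stop watching it
                    del watched[fd]
                    continue
                parts = (entry[1] + chunk).split(b'\n')
                entry[1] = parts.pop()  # keep any trailing partial line
                for line in parts:
                    yield entry[0], line.decode('utf-8', 'replace')
    finally:
        # runs when the generator is closed or garbage collected
        for proc in procs:
            proc.terminate()
            proc.wait()
            proc.stdout.close()
```

The main loop then becomes `for path, line in follow(paths): ...`, with the notification call in the loop body. This still spawns one tail process per file, but each is started once rather than on every pass.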

Other areas of improvement in your script:

  • Use os.listdir and native Python filtering (say, using list comprehensions) instead of a popen with a bunch of grep filters.
  • Update the list of buffers to scan periodically instead of only doing it at program boot.
  • Use subprocess.Popen instead of os.popen.
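
A small sketch of the first and third suggestions. The original script is not shown here, so the `.log` suffix and the helper names `list_logfiles` and `last_line` are assumptions for illustration only:

```python
import os
import subprocess

def list_logfiles(directory, suffix='.log'):
    """Return sorted full paths of the regular files ending in suffix."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(suffix)
        and os.path.isfile(os.path.join(directory, name))
    )

def last_line(path):
    """Return the last line of path using tail via subprocess.Popen."""
    # passing the arguments as a list avoids going through a shell
    proc = subprocess.Popen(['tail', '-n', '1', path],
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out.decode('utf-8', 'replace').rstrip('\n')
```

Calling `list_logfiles` again every so often, rather than once at startup, also covers the second suggestion: newly created logfiles get picked up without restarting the script.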

Emil Sit answered Dec 28 '22 23:12


If you're already using the pyinotify module, it's easy to do this in pure Python (i.e. no need to spawn a separate process to tail each file).

Here is an example that is event-driven by inotify, and should use very little CPU. When IN_MODIFY occurs for a given path, we read all available data from the file handle and output any complete lines found, buffering the incomplete line until more data is available:

import os
import sys
import pynotify
import pyinotify

class Watcher(pyinotify.ProcessEvent):

    def __init__(self, paths):
        self._manager = pyinotify.WatchManager()
        self._notify = pyinotify.Notifier(self._manager, self)
        self._paths = {}
        for path in paths:
            # watch the resolved path so that evt.pathname matches the key
            path = os.path.realpath(path)
            self._manager.add_watch(path, pyinotify.IN_MODIFY)
            fh = open(path, 'rb')
            fh.seek(0, os.SEEK_END)
            self._paths[path] = [fh, '']

    def run(self):
        while True:
            self._notify.process_events()
            if self._notify.check_events():
                self._notify.read_events()

    def process_default(self, evt):
        path = evt.pathname
        fh, buf = self._paths[path]
        # read everything appended since the last event
        data = fh.read()
        lines = data.split('\n')
        # prepend the incomplete line left over from the previous read
        lines[0] = buf + lines[0]
        # the last element is '' if the data ended with a newline; otherwise
        # it is an incomplete line, held back until more data arrives
        buf = lines.pop()
        self._paths[path][1] = buf
        if not lines:
            return

        # display a notification
        notice = pynotify.Notification('%s changed' % path, '\n'.join(lines))
        notice.show()

        # and output to stdout
        for line in lines:
            sys.stdout.write(path + ': ' + line + '\n')
        sys.stdout.flush()

pynotify.init('watcher')
paths = sys.argv[1:]
Watcher(paths).run()

Usage:

% python watcher.py [path1 path2 ... pathN]

samplebias answered Dec 28 '22 23:12