
Tail -f log on server, process data, then serve to client via twisted

Tags: python, twisted

Goal: Show data from server in wxPython GUI on client

Newcomer to Twisted. I have a wxPython GUI running on a Windows 7 client, and I have a program running on an Ubuntu server that produces a log. My current attempt is to tail -f the log, pipe the output to a twisted server, then serve any data that meets my regex conditions to the client. I already have a tunnel open, so I don't need to complicate things with SSH. I've gotten the following block of code running, but it only serves the first line in the input. I know I need to keep checking the input for a newline and then writing it to the transport, but I'm not sure how to do that without breaking the connection.

I haven't been able to find enough information to patch a full solution together. I have also tried various other methods using sockets and file IO, but Twisted seems like the right tool for this problem. Am I on the right track? Any recommendations are appreciated. Thanks

#! /usr/bin/python

import optparse, os, sys

from twisted.internet.protocol import ServerFactory, Protocol

def parse_args():
    usage = """usage: %prog [options]
"""

    parser = optparse.OptionParser(usage)

    help = "The port to listen on. Default to a random available port."
    parser.add_option('--port', type='int', help=help)

    help = "The interface to listen on. Default is localhost."
    parser.add_option('--iface', help=help, default='localhost')

    options = parser.parse_args()

    return options#, log_file

class LogProtocol(Protocol):
    def connectionMade(self):
        for line in self.factory.log:
            self.transport.write(line)

class LogFactory(ServerFactory):
    protocol = LogProtocol

    def __init__(self, log):
        self.log = log

def main():
    log = sys.stdin.readline()
    options, log_file = parse_args()

    factory = LogFactory(log)

    from twisted.internet import reactor

    port = reactor.listenTCP(options.port or 0, factory,
                             interface=options.iface)

    print 'Serving %s on %s.' % (log_file, port.getHost())

    reactor.run()


if __name__ == '__main__':
    main()

To answer the first comment: I have also tried to just read the log from within Python, but the program hangs. Code follows:

#! /usr/bin/python

import optparse, os, sys, time
from twisted.internet.protocol import ServerFactory, Protocol

def parse_args():
    usage = """ usage: %prog [options]"""

    parser = optparse.OptionParser(usage)

    help = "The port to listen on. Default to a random available port"
    parser.add_option('--port', type='int', help=help, dest="port")

    help = "The logfile to tail and write"
    parser.add_option('--file', help=help, default='log/testgen01.log',dest="logfile")

    options = parser.parse_args()
    return options

class LogProtocol(Protocol):
    def connectionMade(self):
        for line in self.follow():
            self.transport.write(line)
        self.transport.loseConnection()

    def follow(self):
        while True:
            line = self.factory.log.readline()
            if not line:
                time.sleep(0.1)
                continue
            yield line

class LogFactory(ServerFactory):
    protocol = LogProtocol

    def __init__(self, log):
        self.log = log

def main():
    options, log_file = parse_args()
    log = open(options.logfile)
    factory = LogFactory(log)

    from twisted.internet import reactor

    port = reactor.listenTCP(options.port or 0, factory)    #,interface=options.iface)

    print 'Serving %s on %s.' % (options.logfile, port.getHost())

    reactor.run()


if __name__ == '__main__':
    main()
asked Nov 16 '11 by jsucsy



1 Answer

You've got a few different easily separated goals you're trying to achieve here. First, I'll talk about watching the log file.

Your generator has a couple of problems. One of them is big: it calls time.sleep(0.1). The sleep function blocks for the amount of time passed to it, and while it is blocking, the thread that called it can't do anything else (that's roughly what "blocking" means, after all). You're iterating over the generator in the same thread in which LogProtocol.connectionMade is called (since connectionMade calls follow), and LogProtocol.connectionMade is called in the same thread the Twisted reactor runs in, because Twisted is roughly single-threaded.

So, you're blocking the reactor with the sleep calls. As long as sleep is blocking the reactor, the reactor can't do anything else, like sending bytes over sockets. Blocking is transitive, by the way, which makes LogProtocol.connectionMade an even bigger problem: it iterates indefinitely, sleeping and reading, so it blocks the reactor indefinitely.

You need to read lines from the file without blocking. You can do this by polling, which is effectively the approach you're taking now, but without the sleep call. Use reactor.callLater to schedule future reads from the file:

from twisted.internet import reactor

def follow(fObj):
    # Poll for one new line, then schedule the next poll instead of sleeping.
    line = fObj.readline()
    reactor.callLater(0.1, follow, fObj)

follow(open(filename))

You can also let LoopingCall deal with the part that makes this a loop that runs forever:

def follow(fObj):
    line = fObj.readline()

from twisted.internet.task import LoopingCall

loop = LoopingCall(follow, open(filename))
loop.start(0.1)

Either of these will let you read new lines from the file over time without blocking the reactor. Of course, they both just drop the line on the floor after they read it. This leads me to the second problem...

You need to react to the appearance of a new line in the file. Presumably you want to write it out to your connection. This isn't too hard: "reacting" is pretty easy; it usually just means calling a function or a method. In this case, it's easiest to have the LogProtocol set up the log following and supply a callback object to handle lines when they appear. Consider this slight adjustment to the follow function from above:

def follow(fObj, gotLine):
    line = fObj.readline()
    if line:
        gotLine(line)

def printLine(line):
    print line

loop = LoopingCall(follow, open(filename), printLine)
loop.start(0.1)

Now you can non-blockingly poll a log file for new lines and learn when one has actually shown up. This is simple to integrate with LogProtocol...

class LogProtocol(Protocol):
    def connectionMade(self):
        self.loop = LoopingCall(follow, open(filename), self._sendLogLine)
        self.loop.start(0.1)

    def _sendLogLine(self, line):
        self.transport.write(line)

One last detail is that you probably want to stop watching the file when the connection is lost:

    def connectionLost(self, reason):
        self.loop.stop()

So, this solution avoids blocking by using LoopingCall instead of time.sleep and pushes lines to the protocol when they're found using simple method calls.
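
For completeness, the pieces above could be assembled into something like the following sketch. The LOG_PATH, PORT, and LINE_PATTERN values are placeholders, and the regex filter plus the seek-to-end (to mimic tail -f) are assumptions drawn from the question rather than part of the answer itself:

#! /usr/bin/python
# Sketch only: LOG_PATH, PORT, and LINE_PATTERN are placeholder values.

import os, re

from twisted.internet import reactor
from twisted.internet.protocol import ServerFactory, Protocol
from twisted.internet.task import LoopingCall

LOG_PATH = 'log/testgen01.log'      # placeholder: the log being tailed
PORT = 1079                         # placeholder: the port to listen on
LINE_PATTERN = re.compile('ERROR')  # placeholder: the regex from the question


def follow(fObj, gotLine):
    # Non-blocking poll: read at most one new line and hand it to the
    # callback; LoopingCall takes care of calling this repeatedly.
    line = fObj.readline()
    if line:
        gotLine(line)


class LogProtocol(Protocol):
    def connectionMade(self):
        # Start watching the file only once a client connects; seek to the
        # end first so only new lines are served, like tail -f.
        logFile = open(self.factory.logPath)
        logFile.seek(0, os.SEEK_END)
        self.loop = LoopingCall(follow, logFile, self._sendLogLine)
        self.loop.start(0.1)

    def connectionLost(self, reason):
        # Stop polling when the client goes away.
        self.loop.stop()

    def _sendLogLine(self, line):
        # Forward only the lines that match the (placeholder) regex.
        if LINE_PATTERN.search(line):
            self.transport.write(line)


class LogFactory(ServerFactory):
    protocol = LogProtocol

    def __init__(self, logPath):
        self.logPath = logPath


def main():
    factory = LogFactory(LOG_PATH)
    port = reactor.listenTCP(PORT, factory)
    print 'Serving %s on %s.' % (LOG_PATH, port.getHost())
    reactor.run()


if __name__ == '__main__':
    main()

A client can then connect through the existing tunnel and receive matching lines as they are appended to the log.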

answered by Jean-Paul Calderone