Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Logging module for print statements: Duplicate log entries

UPDATE: Scroll down to EDIT SECTION (4) for an almost totally working version. Also removed EDIT SECTION (1) because this post is too long otherwise, and that is probably the least helpful section. The link that was originally in EDIT SECTION (1) is below.

How do I duplicate sys.stdout to a log file in python?

Long time searcher here, but first time asking a question.

EXPLANATION:

I need to have prints redirect to the log because I am incorporating C code through system calls that uses print statements for logging messages. There is also piles of old Python code written by my coworkers that gets called upon, which also uses print statements for debugging.

Ultimately, I want this to be able to handle logging.info('message') for my updated code, but also be able to redirect print statements through the built-in logging module for code I either can't change or simply haven't gotten around to updating yet.

Below is some example code I came up with to concisely demonstrate my problems.

QUESTIONS:

  1. I use the setup below for my logs, but every time I make a print, I get duplicate entries in my log (and a blank line). Can anyone explain why this happens?
  2. It would be nice to figure out a better setup for the logging, so that the format statement includes the proper module name when I redirect a print command through logging.
  3. My use of this Tee(object) class seems to break things occasionally. See supporting information section below.

MY CODE:

Edit: Originally setuplog.Tee.init contained if os.path.exists(LOGNAME): os.remove(LOGNAME). This has been removed and put into base.py.

setuplog.py:

#!/usr/bin/python
import sys
import os
import logging
import logging.config

LOGNAME = 'log.txt'

CONFIG = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': {
        'simple': {
            'format': '%(module)s:%(thread)d: %(message)s'
            },
        },
    'handlers': {
        'console': {
            'level': 'NOTSET',
            'class': 'logging.StreamHandler',
            'formatter': 'simple'
            },
        'file': {
            'level': 'NOTSET',
            'class': 'logging.FileHandler',
            'formatter': 'simple',
            'filename': LOGNAME
            },
        },
    'root': {
        'level': 'NOTSET',
        'handlers': ['console', 'file']
        },
    }


class Tee(object):
    def __init__(self):
        logging.config.dictConfig(CONFIG)
        self.logger = logging.getLogger(__name__)
        self.stdout = sys.stdout
        sys.stdout = self
    def __del__(self):
        sys.stdout = self.stdout
    def write(self, data):
        self.logger.info(data)

base.py:

#!/usr/bin/python
import sys
import os
import logging
# My modules.
import setuplog
#import aux2


LOGNAME = 'log.txt'
if os.path.exists(LOGNAME):
    os.remove(LOGNAME)

not_sure_what_to_call_this = setuplog.Tee()

print '1 base'
logging.info('2 base')
print '3 base'
os.system('./aux1.py')
logging.info('4 base')
#aux2.aux2Function()
#logging.info('5 base')

aux1.py:

#!/usr/bin/python
import sys
import os
import logging
import setuplog

not_sure_what_to_call_this = setuplog.Tee()

def main():
    print '1 aux1'
    logging.info('2 aux1')
    print '3 aux1'
    logging.info('4 aux1')


if __name__ == '__main__':
    main()

aux2.py:

#!/usr/bin/python
import sys
import os
import logging
import setuplog

not_sure_what_to_call_this = setuplog.Tee()

def aux2Function():
    print '1 aux2'
    logging.info('2 aux2')
    print '3 aux2'

I then run from the shell "./base.py" and produce the following output (in console and in log.txt):

setuplog:139833296844608: 1 aux1
setuplog:139833296844608:

aux1:139833296844608: 2 aux1
setuplog:139833296844608: 3 aux1
setuplog:139833296844608:

aux1:139833296844608: 4 aux1

As you can see, entries that generated with print are repeated (QUESTION 1). Also, I need to come up with a better convention for displaying the module name (QUESTION 2).

SUPPORTING INFORMATION FOR QUESTION 3:

From base.py, if I uncomment "import aux2", "aux2.aux2Function()", and "logging.info('5 base')", here is the new output (directly from my console as this is the only place the Python errors go to):

base:140425995155264: 2 base
setuplog:140360687101760: 1 aux1
setuplog:140360687101760:

aux1:140360687101760: 2 aux1
setuplog:140360687101760: 3 aux1
setuplog:140360687101760:

aux1:140360687101760: 4 aux1
base:140425995155264: 4 base
aux2:140425995155264: 2 aux2
base:140425995155264: 5 base
Exception AttributeError: "'NoneType' object has no attribute 'stdout'" in <bound method Tee.__del__ of <setuplog.Tee object at 0x7fb772f58f10>> ignored

EDIT SECTION (2):

I have been playing around, and this sort of works. Here are (again) updated versions of the example code.

The reason why it "sort of" works is:

  1. I think exceptions should be avoided at all costs, and this makes use of one.
  2. The log output is now somewhat manual. Let me explain. The %(name)s comes out as expected, but I have to set this manually. I would instead prefer some sort of descriptor that automatically picks out the name of the file, or something similar (picking out functions as a bonus?). %(module)s always display "setuplog" (correctly) for print statements, even if I want the module reported to be where the print statement came from, not the module where my class to direct print statements into the logging module is.

setuplog.py:

#!/usr/bin/python
import sys
import os
import logging
import logging.config

def startLog(name):
    logname = 'log.txt'
    config = {
        'version': 1,
        'disable_existing_loggers': True,
        'formatters': {
            'simple': {
                'format': '%(name)s:%(module)s:%(thread)s: %(message)s'
                },
            },
        'handlers': {
            'console': {
                'level': 'NOTSET',
                'class': 'logging.StreamHandler',
                'formatter': 'simple'
                },
            'file': {
                'level': 'NOTSET',
                'class': 'logging.FileHandler',
                'formatter': 'simple',
                'filename': logname
                },
            },
        'root': {
            'level': 'NOTSET',
            'handlers': ['console', 'file'],
            },
        }
    logging.config.dictConfig(config)
    return logging.getLogger(name)

class Tee():
    def __init__(self, logger):
        self.stdout = sys.stdout
        self.data = ''
        self.logger = logger
        sys.stdout = self
    def __del__(self):
        try:
            sys.stdout = self.stdout
        except AttributeError:
            pass
    def write(self, data):
        self.data += data
        self.data = str(self.data)
        if '\x0a' in self.data or '\x0d' in self.data:
            self.data = self.data.rstrip('\x0a\x0d')
            self.logger.info(self.data)
            self.data = ''

base.py:

#!/usr/bin/python
import sys
import os
import logging
# My modules.
import setuplog
import aux2

LOGNAME = 'log.txt'
if os.path.exists(LOGNAME):
    os.remove(LOGNAME)
logger = setuplog.startLog('base')
setuplog.Tee(logger)

print '1 base'
logger.info('2 base')
print '3 base'
os.system('./aux1.py')
logger.info('4 base')
aux2.aux2Function()
logger.info('5 base')

aux1.py:

#!/usr/bin/python
import sys
import os
import logging
import setuplog


def main():
    logger = setuplog.startLog('aux1')
    setuplog.Tee(logger)
    print '1 aux1'
    logger.info('2 aux1')
    print '3 aux1'
    logger.info('4 aux1')


if __name__ == '__main__':
    main()

aux2.py:

#!/usr/bin/python
import sys
import os
import logging
import setuplog


def aux2Function():
    logger = setuplog.startLog('aux2')
    setuplog.Tee(logger)
    print '1 aux2'
    logger.info('2 aux2')
    print '3 aux2'

And the output:

base:setuplog:139712687740736: 1 base
base:base:139712687740736: 2 base
base:setuplog:139712687740736: 3 base
aux1:setuplog:140408798721856: 1 aux1
aux1:aux1:140408798721856: 2 aux1
aux1:setuplog:140408798721856: 3 aux1
aux1:aux1:140408798721856: 4 aux1
base:base:139712687740736: 4 base
aux2:setuplog:139712687740736: 1 aux2
aux2:aux2:139712687740736: 2 aux2
aux2:setuplog:139712687740736: 3 aux2

EDIT SECTION (3):

Thanks to wonderful reply on reddit (http://www.reddit.com/r/learnpython/comments/1kaduo/python_logging_module_for_print_statements/cbn2lef), I was able to develop a work around for the AttributeError. Rather than using an exception, I converted the class into a singleton.

Here is code for the updated Tee class:

class Tee(object):
    _instance = None
    def __init__(self, logger):
        self.stdout = sys.stdout
        self.data = ''
        self.logger = logger
        sys.stdout = self
    def __new__(cls, *args, **kwargs):
        if not cls._instance:
            cls._instance = super(Tee, cls).__new__(cls, *args, **kwargs)
        return cls._instance
    def __del__(self):
        sys.stdout = self.stdout
    def write(self, data):
        self.data += data
        self.data = str(self.data)
        if '\x0a' in self.data or '\x0d' in self.data:
            self.data = self.data.rstrip('\x0a\x0d')
            self.logger.info(self.data)
            self.data = ''

EDIT SECTION (4):

This almost completely works! It works well enough for me to implement it. The only problem now is getting the format of the output to be more helpful. For example, %(filename)s is setuplog.py for all rerouted print statements. It would be much more beneficial if %(filename)s was the file from which the print statement originated. Any ideas?

Also, I had to ditch the dictionary method. The only way I could get this to all work was by setting up the logger through Python code.

Last point, look into aux3.py. If os.system is used rather than subprocess, the order of the logging gets messed up. Does anyone know a work around to still use os.system and get the proper order (so I don't have to change every last os.system to a subprocess.Popen)?

setuplog.py (you can ignore the functions startDictLog and startFileLog because they don't work. However, startCodeLog does!):

#!/usr/bin/python
import sys
import os
import logging
import logging.config

def startLog(name, propagate):
    '''The only point of this function was to enable me to quickly
    and easily switch how I wanted to configure the logging. So far,
    I have only been able to get the last way working
    (startCodeLog).'''
    #return startDictLog(name)
    #return startFileLog(name)
    return startCodeLog(name, propagate)

def startDictLog(name):
    '''Configure logging usinga dictionary.'''
    LOGNAME = 'loop.log'
    DEBUGNAME = 'debug.log'
    config = {
        'version': 1,
        'disable_existing_loggers': True,
        'formatters': {
            'bare': {
                # Added the BARE to distinguish between normal prints
                # and those that get rerouted. In use, I would format it
                # such that only the message is printed.
                'format': 'BARE: %(message)s'
                },
            'simple': {
                'format': '%(module)s-%(name)s: %(message)s'
                },
            'time': {
                'format': '%(asctime)s-%(filename)s-%(module)s-%(name)s: %(message)s',
                'datefmt': '%H:%M:%S'
                },
            },
        'handlers': {
            'console': {
                'level': 'NOTSET',
                'class': 'logging.StreamHandler',
                'formatter': 'bare'
                },
            'normal': {
                'level': 'INFO',
                'class': 'logging.FileHandler',
                'formatter': 'simple',
                'filename': LOGNAME
                },
            'debug': {
                'level': 'NOTSET',
                'class': 'logging.FileHandler',
                'formatter': 'time',
                'filename': DEBUGNAME
                },
            },
        'root': {
            'level': 'NOTSET',
            'handlers': ['console', 'normal', 'debug'],
            },
        }
    logging.config.dictConfig(config)
    return logging.getLogger(name)

def startFileLog(name):
    '''Configure logging using a configuration file.'''
    CONFIGFILE = 'logging.conf'
    logging.config.fileConfig(CONFIGFILE)
    return logging.getLogger(name)

def startCodeLog(name, propagate):
    '''Configure logging using this code.'''
    LOGFILE = 'loop.log'
    DEBUGFILE = 'debug.log'
    _logger = logging.getLogger(name)
    _logger.setLevel(logging.NOTSET)
    if propagate in [False, 'n', 'no', 0]:
        _logger.propagate = False
    _console = logging.StreamHandler()
    _normal = logging.FileHandler(LOGFILE)
    _debug = logging.FileHandler(DEBUGFILE)

    _bare = logging.Formatter('BARE: %(message)s')
    _simple = logging.Formatter('%(module)s-%(name)s: %(message)s')
    _time = logging.Formatter('%(asctime)s-%(module)s-%(name)s: %(message)s',
                                       datefmt = '%H:%M:%S')
    # I added _complex only here, to the working way of setting up the configuration,
    # in hopes that this data may help someone figure out how to get the printout
    # to be more useful. For example, it's not helpful that the filename is
    # setuplog.py for every print statement. It would be more beneficial to somehow
    # get the filename of where the print statement originated.
    _complex = logging.Formatter('%(filename)s-%(funcName)s-%(name)s: %(message)s')

    # Normally this is set to _bare to imitate the output of the old version of the
    # scripts I am updating, but for our purposes, _complex is more convenient.
    _console.setLevel(logging.NOTSET)
    _console.setFormatter(_complex)

    # This imitates the format of the logs from versions before I applied this update.
    _normal.setLevel(logging.INFO)
    _normal.setFormatter(_bare)

    # This is a new log file I created to help me debug other aspects of the scipts
    # I am updating.
    _debug.setLevel(logging.DEBUG)
    _debug.setFormatter(_time)

    _logger.addHandler(_console)
    _logger.addHandler(_normal)
    _logger.addHandler(_debug)
    return _logger

class Tee(object):
    '''Creates a singleton class that tees print statements to the
    handlers above.'''
    _instance = None
    def __init__(self, logger):
        self.stdout = sys.stdout
        self.logger = logger
        sys.stdout = self
        self._buf = ''
        # Part of old method in the write function.
        #self.data = ''
    def __new__(cls, *args, **kwargs):
        '''This is the singleton implementation. This avoids errors with
        multiple instances trying to access the same standard output.'''
        if not cls._instance:
            cls._instance = super(Tee, cls).__new__(cls, *args, **kwargs)
        return cls._instance
    def write(self, data):
        # This method doesn't work with how I had to implement subprocess.
        #self.data = data.rstrip('\r\n')
        #if self.data:
        #    self.logger.info(self.data)

        # Also doesn't seem to work with my subprocess implementation.
        #self.data += data
        #self.data = str(self.data)
        #if '\x0a' in self.data or '\x0d' in self.data:
        #    self.data = self.data.rstrip('\x0a\x0d')
        #    self.logger.info(self.data)
        #    self.data = ''

        # Only way I could get it working with my subprocess implementation.
        self._buf = ''.join([self._buf, data])
        while '\n' in self._buf:
            line, _, tail = self._buf.partition('\n')
            if line.strip():
                self.logger.info(line)
            self._buf = tail

base.py:

#!/usr/bin/python
import sys
import os
import logging
import subprocess
# My modules below.
import setuplog
import aux2

# It is assumed that this script will be executed via the command line,
# and each time we run it, we want new log files.
LOGNAME = 'loop.log'
DEBUGNAME ='debug.log'
if os.path.exists(LOGNAME):
    os.remove(LOGNAME)
if os.path.exists(DEBUGNAME):
    os.remove(DEBUGNAME)

# It seems more convenient to store the logging configuration elsewhere,
# hence my module setuplog.
logger = setuplog.startLog('', True)
logger = setuplog.startLog('base', False)
# This initializes sys.stdout being redirected through logging. Can anyone
# explain how calling this class achieves this? I am a bit fuzzy on my
# understanding here.
setuplog.Tee(logger)

# Test to see how it works in this module.
print '1 base'
logger.info('2 base')
print '3 base'

# Below shows how to get logging to work with scripts that can
# not be modified. In my case, I have C code that I don't want to modify,
# but I still need to log it's output.

# I will have to go through the old code and change all os.system calls
# to instead utilize subprocess. Too bad because this will take some time
# and be "busy work", but at least it works. What can ya do, os.system is
# depreciated anyway. The positive side is that only the root application
# needs to change its system calls to use subprocess. The scipts that it
# calls upon can remain untouched.

aux1_py_path = '"%s"' % os.path.join(os.path.dirname(__file__), 'aux1.py')
#os.system(aux1_py_path) # Example to show how os.system doesn't work.
print 'aux1_py_path:', aux1_py_path
sys_call = subprocess.Popen(aux1_py_path, shell=True, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
sys_stdout, sys_stderr = sys_call.communicate()
print sys_stdout
print sys_stderr

# This is to ensure that the order of the logging events is correct. With an
# old method I used (not with logging, simply redirecting stdout into an
# opened file), the order would often get messed up.
logger.info('4 base')
# This example is to show how to get logging to work with Python scripts I am
# willing to modify, at least partially. See aux2.py. It is simply a print
# statement, followed by a logging statement, followed by another print
# statement. They all output properly with this method.
aux2.aux2Function()
# Again to ensure the order of events is correct.
logger.info('5 base')

aux1.py:

#!/usr/bin/python
import logging
import subprocess
import os

def main():
    '''We expect the print statements to go through, as they are
    being sent to logging. However, these logging statements
    do nothing as no logger has been instantiated. This is the
    behavior we should expect, as this script mimics a script
    that we would not modify, so it would not have logging calls
    anyway.'''
    print '1 aux1'
    logging.info('2 aux1')
    print '3 aux1'
    logging.info('4 aux1')

    # Here, neither option works unless the root of all these calls
    # was made with subprocess and in the script that called the
    # tee class. If both conditions are met, subprocess works as it
    # should. However, os.system returns the print out from the call
    # out of order.

    aux3_py_path = '"%s"' % os.path.join(os.path.dirname(__file__), 'aux3.py')
    #os.system(aux3_py_path)
    sys_call = subprocess.Popen(aux3_py_path, shell=True,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    sys_stdout, sys_stderr = sys_call.communicate()
    print sys_stdout
    print sys_stderr

if __name__ == '__main__':
    main()

aux2.py:

#!/usr/bin/python
import sys
import os
import logging
import setuplog

def aux2Function():
    # The following two lines are not necessary for most functionality.
    # However, if the following two lines are uncommented, then %(name)s
    # in the format descriptor of logging will correctly be 'aux2' rather
    # than base. Both lines are required for this to work.
    logger = setuplog.startLog('aux2', False)
    setuplog.Tee(logger)

    print '1 aux2'
    logging.info('2 aux2')
    print '3 aux2'

aux3.py:

#!/usr/bin/python
import logging

def main():
    '''See __doc__ for aux1.py. Again, we don't expect the logging.info
    to work, but that's okay because theoretically, this is some script
    we can't modify that simply generates output with print or print
    like functions.'''
    print '1 aux3'
    logging.info('2 aux3')
    print '3 aux3'

if __name__ == '__main__':
    main()

logging.conf (doesn't work):

[loggers]
keys=root

[handlers]
keys=console,normal,debug

[formatters]
keys=bare,simple,time

[logger_root]
level=NOTSET
handlers=console,normal,debug

[handler_console]
level=NOTSET
class=StreamHandler
formatter=bare
args=(sys.stdout,)

[handler_normal]
level=INFO
class=FileHandler
formatter=simple
args=('loop.log',)

[handler_debug]
level=DEBUG
class=FileHandler
formatter=time
args=('debug.log',)

[formatter_bare]
format=%(message)s

[formatter_simple]
format=%(module)s-%(name)s: %(message)s

[formatter_time]
format=%(asctime)s-%(filename)s-%(module)s-%(name)s: %(message)s
datefmt=%H:%M:%S
like image 436
Eric Hansen Avatar asked Oct 04 '22 06:10

Eric Hansen


1 Answers

Here's at least a partial answer to your first question. You're getting those blank lines on eachprintbecauseprintstatements in Python 2.x potentially callstdout.write()twice, once with the data from evaluating the expression in the statement, and then again if the optional newline hasn't been suppressed by a trailing comma.

Also, if anINFO level message is sent to a logger set at alevelof'NOTSET', the message will also be echoed on sys.stderrby default according to the logger documentation -- which is why you see the console output even when the Tee is in effect. There are no duplicate log entries that I can see.

To prevent the blanks lines, try using thisTeeclass definition. Note thewrite()method modifications (updated to be a singleton to match your "EDIT SECTION (3)"):

class Tee(object):
    _instance = None
    def __init__(self, logger):
        self.stdout = sys.stdout
        self.logger = logger
        sys.stdout = self
    def __new__(cls, *args, **kwargs):
        if not cls._instance:
            cls._instance = super(Tee, cls).__new__(cls, *args, **kwargs)
        return cls._instance
    def __del__(self):
        sys.stdout = self.stdout
    def write(self, data):
        data = data.rstrip('\r\n')
        if data: # anything left?
           self.logger.info(data)

With this and all your other updates in place it seems to work for me (including with the uncommented aux2 stuff. Are there some remaining problems from your point of view? If so, your initially long question has gotten even more so and should be completely cleaned-up leaving just the most recent code and focus on any remaining issue(s).

like image 190
martineau Avatar answered Oct 12 '22 10:10

martineau