Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ensuring Python logging in multiple threads is thread-safe

I have a log.py module, that is used in at least two other modules (server.py and device.py).

It has these globals:

fileLogger = logging.getLogger() fileLogger.setLevel(logging.DEBUG) consoleLogger = logging.getLogger() consoleLogger.setLevel(logging.DEBUG)  file_logging_level_switch = {     'debug':    fileLogger.debug,     'info':     fileLogger.info,     'warning':  fileLogger.warning,     'error':    fileLogger.error,     'critical': fileLogger.critical }  console_logging_level_switch = {     'debug':    consoleLogger.debug,     'info':     consoleLogger.info,     'warning':  consoleLogger.warning,     'error':    consoleLogger.error,     'critical': consoleLogger.critical } 

It has two functions:

def LoggingInit( logPath, logFile, html=True ):     global fileLogger     global consoleLogger      logFormatStr = "[%(asctime)s %(threadName)s, %(levelname)s] %(message)s"     consoleFormatStr = "[%(threadName)s, %(levelname)s] %(message)s"      if html:         logFormatStr = "<p>" + logFormatStr + "</p>"      # File Handler for log file     logFormatter = logging.Formatter(logFormatStr)     fileHandler = logging.FileHandler(          "{0}{1}.html".format( logPath, logFile ))     fileHandler.setFormatter( logFormatter )     fileLogger.addHandler( fileHandler )      # Stream Handler for stdout, stderr     consoleFormatter = logging.Formatter(consoleFormatStr)     consoleHandler = logging.StreamHandler()      consoleHandler.setFormatter( consoleFormatter )     consoleLogger.addHandler( consoleHandler ) 

And:

def WriteLog( string, print_screen=True, remove_newlines=True,          level='debug' ):      if remove_newlines:         string = string.replace('\r', '').replace('\n', ' ')      if print_screen:         console_logging_level_switch[level](string)      file_logging_level_switch[level](string) 

I call LoggingInit from server.py, which initializes the file and console loggers. I then call WriteLog from all over the place, so multiple threads are accessing fileLogger and consoleLogger.

Do I need any further protection for my log file? The documentation states that thread locks are handled by the handler.

like image 818
nckturner Avatar asked Jun 05 '13 00:06

nckturner


People also ask

Is logging in Python thread-safe?

The logging module is thread-safe; it handles the locking for you.

Is Python good for multi-threading?

Python doesn't support multi-threading because Python on the Cpython interpreter does not support true multi-core execution via multithreading. However, Python does have a threading library. The GIL does not prevent threading.

Does threading in Python use multiple cores?

Only a single thread can acquire that lock at a time, which means the interpreter ultimately runs the instructions serially. This design makes memory management thread-safe, but as a consequence, it can't utilize multiple CPU cores at all.


1 Answers

The good news is that you don't need to do anything extra for thread safety, and you either need nothing extra or something almost trivial for clean shutdown. I'll get to the details later.

The bad news is that your code has a serious problem even before you get to that point: fileLogger and consoleLogger are the same object. From the documentation for getLogger():

Return a logger with the specified name or, if no name is specified, return a logger which is the root logger of the hierarchy.

So, you're getting the root logger and storing it as fileLogger, and then you're getting the root logger and storing it as consoleLogger. So, in LoggingInit, you initialize fileLogger, then re-initialize the same object under a different name with different values.

You can add multiple handlers to the same logger—and, since the only initialization you actually do for each is addHandler, your code will sort of work as intended, but only by accident. And only sort of. You will get two copies of each message in both logs if you pass print_screen=True, and you will get copies in the console even if you pass print_screen=False.

There's actually no reason for global variables at all; the whole point of getLogger() is that you can call it every time you need it and get the global root logger, so you don't need to store it anywhere.


A more minor problem is that you're not escaping the text you insert into HTML. At some point you're going to try to log the string "a < b" and end up in trouble.

Less seriously, a sequence of <p> tags that isn't inside a <body> inside an <html> is not a valid HTML document. But plenty of viewers will take care of that automatically, or you can post-process your logs trivially before displaying them. But if you really want this to be correct, you need to subclass FileHandler and have your __init__ add a header if given an empty file and remove a footer if present, then have your close add a footer.


Getting back to your actual question:

You do not need any additional locking. If a handler correctly implements createLock, acquire, and release (and it's called on a platform with threads), the logging machinery will automatically make sure to acquire the lock when needed to make sure each message is logged atomically.

As far as I know, the documentation doesn't directly say that StreamHandler and FileHandler implement these methods, it does strongly imply it (the text you mentioned in the question says "The logging module is intended to be thread-safe without any special work needing to be done by its clients", etc.). And you can look at the source for your implementation (e.g., CPython 3.3) and see that they both inherit correctly-implemented methods from logging.Handler.


Likewise, if a handler correctly implements flush and close, the logging machinery will make sure it's finalized correctly during normal shutdown.

Here, the documentation does explain what StreamHandler.flush(), FileHandler.flush(), and FileHandler.close(). They're mostly what you'd expect, except that StreamHandler.close() is a no-op, meaning it's possible that final log messages to the console may get lost. From the docs:

Note that the close() method is inherited from Handler and so does no output, so an explicit flush() call may be needed at times.

If this matters to you, and you want to fix it, you need to do something like this:

class ClosingStreamHandler(logging.StreamHandler):     def close(self):         self.flush()         super().close() 

And then use ClosingStreamHandler() instead of StreamHandler().

FileHandler has no such problem.


The normal way to send logs to two places is to just use the root logger with two handlers, each with their own formatter.

Also, even if you do want two loggers, you don't need the separate console_logging_level_switch and file_logging_level_switch maps; calling Logger.debug(msg) is exactly the same thing as calling Logger.log(DEBUG, msg). You'll still need some way to map your custom level names debug, etc. to the standard names DEBUG, etc., but you can just do one lookup, instead of doing it once per logger (plus, if your names are just the standard names with different cast, you can cheat).

This is all described pretty well in the `Multiple handlers and formatters section, and the rest of the logging cookbook.

The only problem with the standard way of doing this is that you can't easily turn off console logging on a message-by-message basis. That's because it's not a normal thing to do. Usually, you just log by levels, and set the log level higher on the file log.

But, if you want more control, you can use filters. For example, give your FileHandler a filter that accepts everything, and your ConsoleHandler a filter that requires something starting with console, then use the filter 'console' if print_screen else ''. That reduces WriteLog to almost a one-liner.

You still need the extra two lines to remove newlines—but you can even do that in the filter, or via an adapter, if you want. (Again, see the cookbook.) And then WriteLog really is a one-liner.

like image 132
abarnert Avatar answered Oct 18 '22 03:10

abarnert