I am trying to play with the Python logging module but got a bit confused here. Below is a standard script that creates a logger first, then creates a file handler and a console handler and adds both to the logger.
import logging
logger = logging.getLogger('logging_test')
logger.setLevel(logging.DEBUG)
print(len(logger.handlers)) # output: 0
# create file handler which logs even debug messages
fh = logging.FileHandler('/home/Jian/Downloads/spam.log', mode='w')
fh.setLevel(logging.DEBUG)
# create console handler, also at DEBUG level
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
# create formatter and add it to the handlers
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
ch.setFormatter(formatter)
fh.setFormatter(formatter)
# add the handlers to logger
logger.addHandler(ch)
logger.addHandler(fh)
print(len(logger.handlers)) # output: 2
# write some log messages
logger.debug('debug message')
logger.info('info message')
logger.warning('warn message')
logger.error('error message')
logger.critical('critical message')
I run this on a newly started kernel. The file handler works as expected, but in the console output I get some sort of duplicated messages:
2015-07-14 10:59:26,942 - logging_test - DEBUG - debug message
DEBUG:logging_test:debug message
2015-07-14 10:59:26,944 - logging_test - INFO - info message
INFO:logging_test:info message
2015-07-14 10:59:26,944 - logging_test - WARNING - warn message
WARNING:logging_test:warn message
2015-07-14 10:59:26,945 - logging_test - ERROR - error message
ERROR:logging_test:error message
2015-07-14 10:59:26,946 - logging_test - CRITICAL - critical message
CRITICAL:logging_test:critical message
I suppose the log messages with timestamps come from the user-defined console handler, but where do the duplicated messages come from? Can I get rid of them, say, keep only every other line? Any help is appreciated.
IPython has a cell magic, %%capture, which captures the stdout/stderr of a cell. With this magic you can discard these streams or store them in a variable; by default, %%capture discards them. This is a simple way to suppress unwanted output.
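For example, assuming the logger from the question has already been set up in the notebook, a cell like the following stores whatever the cell writes to stdout/stderr in a variable instead of displaying it (captured_output is just an illustrative name):
%%capture captured_output
# anything written to sys.stdout/sys.stderr while this cell runs is captured
logger.error('error message')
The text can then be inspected via captured_output.stdout and captured_output.stderr, or replayed with captured_output.show(). Note that this only silences handlers that write to the sys.stdout/sys.stderr objects %%capture temporarily swaps out; a handler holding a reference to some other stream is unaffected.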
Loggers are normally obtained with getLogger(name). The getLogger() function accepts a single optional argument, the logger's name: it returns a reference to a logger instance with the specified name if one is provided, or to the root logger if not. Multiple calls to getLogger() with the same name return a reference to the same logger object.
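A small sketch of that behaviour (the variable names are just illustrative):
import logging
a = logging.getLogger('logging_test')
b = logging.getLogger('logging_test')   # same name -> the very same object
root = logging.getLogger()              # no name -> the root logger
print(a is b)       # True
print(a is root)    # False
print(root.name)    # 'root'
In a notebook this also means that re-running a setup cell keeps handing back (and adding handlers to) the same logger object rather than a fresh one.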
In a Jupyter notebook, click the down arrow next to the kernel in the right corner; the widget lets you open the Spark UI, Driver Logs, or Kernel Log, each of which opens in a separate tab.
The issue has been raised here.
The observation is the following: in plain Python, or in the IPython console, no handler is installed on the root logger until the root logger itself is used to issue a log message:
In [1]: import logging
In [2]: logging.getLogger().handlers
Out[2]: []
In [3]: logging.warn('Something happened!')
WARNING:root:Something happened!
In [4]: logging.getLogger().handlers
Out[4]: [<logging.StreamHandler at 0x42acef0>]
However, in the IPython notebook, a default stderr StreamHandler is installed on the root logger right away:
In [1]: import logging
In [2]: logging.getLogger().handlers
Out[2]: [<logging.StreamHandler at 0x35eedd8>]
Maybe I'm missing something, but I think that in the notebook no handler should be installed automatically, for a number of reasons. A user can work around the duplication by setting a logger's propagate attribute to False, so that its messages are not propagated to the root logger, but this not only stops debug output from reaching the notebook, it also suppresses more severe messages, and it prevents users from capturing all log output if they want to. An alternative might be to add a configuration option that specifies a log level for the automatically added stream handler, so that less severe messages can be kept from showing up in the notebook automatically, but this would still make the behaviour differ between the IPython console and the IPython notebook.
The only downside I see with making sure there is no default handler set is that some libraries/notebooks in use might rely on this behaviour and actively work around it, for example by disabling their own handlers if they detect that they are running in the ipython notebook. Such cases would likely break with such a change.
So setting logger.propagate to False or using reload(logging) will prevent the duplicate output, but, depending on your setup, either approach can have side effects. Note that the reload builtin is no longer available in Python 3; use importlib.reload (added in Python 3.4) instead.
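A minimal sketch of those two workarounds, assuming the 'logging_test' logger from the question (either one alone is enough, and both carry the side effects described above):
import logging
import importlib

logger = logging.getLogger('logging_test')

# Workaround 1: stop records from propagating up to the root logger,
# so the notebook's pre-installed root handler never sees them
logger.propagate = False

# Workaround 2: reload the logging module, recreating its module-level
# state (including the root logger and its handler list) from scratch
importlib.reload(logging)
Of the two, setting propagate is the less invasive option: it affects only the one named logger, whereas reloading logging resets state that other code may already hold references to.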
When you run
# create console handler, also at DEBUG level
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
you are creating yet another StreamHandler, in addition to the one IPython has already installed. To solve your problem you should instead reuse the StreamHandler from IPython:
import logging

# find the StreamHandler that IPython has already attached to the root logger
handler_console = None
for h in logging.getLogger().handlers:
    if isinstance(h, logging.StreamHandler):
        handler_console = h
        break
If no such handler exists, create your own:
if handler_console is None:
    handler_console = logging.StreamHandler()
Finally, set the formatter (and any other properties) as desired:
if handler_console is not None:
    # first remove it from the root logger to avoid duplicated output
    logging.getLogger().removeHandler(handler_console)
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    handler_console.setFormatter(formatter)
    # then attach it to our own logger
    logger.addHandler(handler_console)
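As a quick sanity check (assuming the steps above were run in a fresh kernel), the console messages should now appear only once, in the timestamped format:
print(logging.getLogger().handlers)   # the reused StreamHandler is no longer on the root logger
print(logger.handlers)                # ...it is now attached to 'logging_test'
logger.debug('debug message')         # printed once, e.g. "... - logging_test - DEBUG - debug message"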