When writing Python asyncio programs, there is often an async function that has many invocations running concurrently. I want to add some logging to this function, but the logging output from different invocations will be interleaved, making it hard to follow. My current solution is to somehow create a unique name for each invocation and log that name each time, like this:
async def make_request(args):
    logger = logging.getLogger('myscript.request')
    log_name = unique_name()
    logger.debug('[%s] making request with args %r', log_name, args)
    response = await request(args)
    logger.debug('[%s] response: %r', log_name, response)
However, having to put log_name in every logging call gets tiring pretty quickly. To save those keystrokes, I came up with a different solution, creating a new logger with a unique name for each invocation:
async def make_request(args):
    logger = logging.getLogger(f'myscript.request.{unique_name()}')
    logger.debug('making request with args %r', args)
    response = await request(args)
    logger.debug('response: %r', response)
Are there any down sides to this approach? The only thing I can think of is that creating a new logger may be expensive, but is that actually the case? Are there any pitfalls I'm not seeing?
Are there any down sides to [creating a new logger for each coroutine]?
Other than the possible price of creating a logger, another downside is that the logger you create stays associated with the unique name forever and is never destroyed, so you effectively have a memory leak. This is explicitly promised by the documentation:
Multiple calls to getLogger() with the same name will always return a reference to the same Logger object.
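You can watch this accumulation happen by peeking at the module-level logger registry. Note that logging.Logger.manager.loggerDict is an implementation detail, so this is only an illustration, not something to rely on:

import logging

for i in range(3):
    logging.getLogger(f'myscript.request.{i}')

# The registry keeps a reference to every logger (and placeholder)
# ever created, so per-invocation names never go away.
print(len(logging.Logger.manager.loggerDict))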
I'd recommend just biting the bullet and creating a helper with the desired functionality. Building on Brad Solomon's answer, the wrapper might look like this (untested):
import asyncio, itertools, weakref, logging
logging.basicConfig(format='%(asctime)-15s %(task_name)s %(message)s',
                    level=logging.DEBUG)

class TaskLogger:
    _next_id = itertools.count().__next__
    _task_ids = weakref.WeakKeyDictionary()

    def __init__(self):
        self._logger = logging.getLogger('myscript.request')

    def _task_name(self):
        # Assign a stable, human-readable id to each task the first time
        # it logs; the WeakKeyDictionary lets finished tasks be
        # garbage-collected instead of accumulating forever.
        task = asyncio.current_task()
        if task not in self._task_ids:
            self._task_ids[task] = self._next_id()
        return f'task-{self._task_ids[task]}'

    def debug(self, *args, **kwargs):
        # Inject the task name via 'extra' so %(task_name)s resolves.
        self._logger.debug(*args, extra={'task_name': self._task_name()}, **kwargs)

    # the same for info, etc.

logger = TaskLogger()
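To sketch how this might be used (untested, assuming the TaskLogger above), each concurrently running coroutine picks up its own task-N label automatically:

async def make_request(args):
    logger.debug('making request with args %r', args)
    await asyncio.sleep(0)  # stand-in for the real request
    logger.debug('response received')

async def main():
    # Both invocations log through the same shared TaskLogger, but each
    # line carries the task-0/task-1 name of the task that emitted it.
    await asyncio.gather(make_request('a'), make_request('b'))

asyncio.run(main())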
Instead of creating a new logger, you may want to consider taking advantage of custom attributes in the log message, via the extra parameter. For example:
FORMAT = '%(asctime)-15s %(unique_name)s %(message)s'
# [Configure/format loggers & handlers]
Then, within the coroutine, logging a debug-level message would look something like:
logger.debug('making request with args %r', args, extra={'unique_name': unique_name()})
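Pieced together, a minimal runnable version of this setup might look like the sketch below, where unique_name() is a hypothetical stand-in for the question's helper:

import itertools
import logging

FORMAT = '%(asctime)-15s %(unique_name)s %(message)s'
logging.basicConfig(format=FORMAT, level=logging.DEBUG)
logger = logging.getLogger('myscript.request')

# Hypothetical stand-in for the question's unique_name() helper.
_counter = itertools.count()
def unique_name():
    return f'req-{next(_counter)}'

logger.debug('making request with args %r', ('a', 'b'),
             extra={'unique_name': unique_name()})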
One other thing to keep in mind: unique_name() could get expensive if you're making a lot of requests. A common pattern when creating concurrency via multiprocessing is to log the calling process ID via os.getpid(). With asyncio, perhaps a very rough cousin would be some identifier for the current Task, which you could get to via asyncio.current_task(). Each task has a _name attribute that should be unique because it calls an incrementing _task_name_counter():
class Task(futures._PyFuture):  # Inherit Python Task implementation
    def __init__(self, coro, *, loop=None, name=None):
        # ...
        if name is None:
            self._name = f'Task-{_task_name_counter()}'
        else:
            self._name = str(name)
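Since Python 3.8 that name is also exposed through the public Task.get_name() accessor, so a sketch along these lines avoids touching the private attribute (untested):

import asyncio
import logging

logging.basicConfig(format='%(asctime)-15s %(unique_name)s %(message)s',
                    level=logging.DEBUG)
logger = logging.getLogger('myscript.request')

async def make_request(args):
    # Task.get_name() (Python 3.8+) returns the auto-generated 'Task-N'
    # name unless the task was given an explicit one at creation.
    name = asyncio.current_task().get_name()
    logger.debug('making request with args %r', args,
                 extra={'unique_name': name})

async def main():
    await asyncio.gather(make_request('a'), make_request('b'))

asyncio.run(main())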