 

Passing binary data to a python logger

Tags: python, logging

I want to log raw bytes. But if I change the file mode in FileHandler from "w" to "wb", the logger fails with the error below, whether I pass it a string or bytes.

logging.getLogger("clientIn").error(b"bacd")

Traceback (most recent call last):
  File "/usr/lib/python3.4/logging/__init__.py", line 980, in emit
    stream.write(msg)
TypeError: 'str' does not support the buffer interface
Call stack:
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.4/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.4/multiprocessing/spawn.py", line 119, in _main
    return self._bootstrap()
  File "/usr/lib/python3.4/multiprocessing/process.py", line 254, in _bootstrap
    self.run()
  File "/usr/lib/python3.4/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/serj/work/proxy_mult/proxy/connection_worker_process.py", line 70, in __call__
    self._do_work(ipc_socket)
  File "/home/serj/work/proxy_mult/proxy/connection_worker_process.py", line 76, in _do_work
    logging.getLogger("clientIn").error("bacd")
Message: 'bacd'

I need a way to adapt the logging module to binary data.

asked May 30 '26 18:05 by VladimirLenin



1 Answer

The easiest solution would be to store the bytes in a bytestring.
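One possible reading of this suggestion, sketched below, is to leave the handler in text mode and convert the bytes to a textual form before logging. The helper name bytes_to_text and the logger name hex_demo are made up for the illustration, and an in-memory stream stands in for the log file.

```python
import io
import logging


def bytes_to_text(payload: bytes) -> str:
    """Hypothetical helper: render raw bytes as a hex string that is safe to log."""
    return payload.hex()


# An in-memory text stream stands in for a file opened in the default "w"/"a" mode.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("hex_demo")  # illustrative logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(handler)

# Option 1: hex-encode the bytes explicitly.
logger.error(bytes_to_text(b"bacd"))
# Option 2: let %r produce the bytes literal as text.
logger.error("%r", b"\x00\xff")

logged = stream.getvalue()
print(logged)
```

The log stays human-readable this way, at the cost of having to decode the hex again if the original bytes are needed later.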

The other possible way is to customize your logging. The documentation is a start, but you will need to look at examples of how other people have done it. Personally, I have only gone as far as using a slightly customized record, handler and formatter so that my logger can use a SQLite backend.

There are several things you may need to modify (sorry for not being more specific, but I am still a beginner when it comes to Python's logging module):

  • LogRecord - if you inherit from it, you will see that __init__(...) takes an argument msg of type object. As the documentation states, msg is "the event description message, possibly a format string with placeholders for variable data". Imho, if msg were supposed to be just a string, it would not have been declared as an object. This is a place where you can investigate further, including the use of args. Inheriting is not really necessary in many cases, and a simple namedtuple would do just fine.

  • LoggerAdapter - an adapter carries the contextual information of a message, which can contain arbitrary data (from what I understand). You will need a custom adapter to work with that.
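As a minimal sketch of the LoggerAdapter idea, the snippet below attaches a bytes payload as contextual data. The ListHandler class, the attribute name payload and the logger name adapter_demo are assumptions made for the illustration; they are not part of the logging module.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that just keeps the records it receives."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


base = logging.getLogger("adapter_demo")  # illustrative logger name
base.setLevel(logging.DEBUG)
base.propagate = False
collector = ListHandler()
base.addHandler(collector)

# The adapter passes its dict as `extra`, so every key becomes an attribute
# on each LogRecord created through it.
adapter = logging.LoggerAdapter(base, {"payload": b"\x01\x02\x03"})
adapter.info("received packet")

record = collector.records[0]
print(record.getMessage(), record.payload)
```

If the payload changes on every call, passing it directly with base.info("received packet", extra={"payload": data}) avoids creating a new adapter each time.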

In addition, you will probably have to use a custom Formatter and/or Handler. In the worst case, you will have to log some arbitrary string message and pass the extra data (binary or otherwise) alongside it.
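For the original question, a custom Handler might look roughly like the sketch below. BinaryFileHandler is a hypothetical class, not something the logging module provides: it opens its own file in binary mode, writes bytes messages unchanged, and encodes anything else as UTF-8 text. The temporary path and the logger name are also just for the demonstration.

```python
import logging
import os
import tempfile


class BinaryFileHandler(logging.Handler):
    """Hypothetical handler that appends to a file opened in binary mode."""

    def __init__(self, filename, level=logging.NOTSET):
        super().__init__(level)
        self._file = open(filename, "ab")

    def emit(self, record):
        try:
            if isinstance(record.msg, (bytes, bytearray)):
                # Raw bytes are written unchanged, with no delimiter added.
                data = bytes(record.msg)
            else:
                # Everything else goes through the normal formatter.
                data = self.format(record).encode("utf-8") + b"\n"
            self._file.write(data)
            self._file.flush()
        except Exception:
            self.handleError(record)

    def close(self):
        self._file.close()
        super().close()


log_path = os.path.join(tempfile.mkdtemp(), "client_in.bin")
logger = logging.getLogger("clientIn_demo")  # illustrative logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = BinaryFileHandler(log_path)
logger.addHandler(handler)

logger.error(b"\x00bacd\xff")
logger.error("plain text")

logger.removeHandler(handler)
handler.close()

with open(log_path, "rb") as fh:
    written = fh.read()
print(written)
```

Since raw bytes are written without any framing, a real log would probably need a length prefix or some other record separator to be parsed back reliably; that design choice is left open here.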

Here is a quick and dirty example, where I use a namedtuple to hold the extra data. Note that I was unable to pass just the extra data without an actual message, but you might be able to work around this if you implement your own custom LogRecord. Also note that I am omitting the rest of my code, since this is just a demonstration of customization:

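import logging
from collections import namedtuple

from torch.utils.tensorboard import SummaryWriter  # PyTorch's TensorBoard writer

TensorBoardLogRecord = namedtuple('TensorBoardLogRecord', 'dtime lvl src msg tbdata')
TensorBoardLogRecordData = namedtuple('tbdata', 'image images scalar scalars custom_scalars')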


class TensorBoardLoggerHandler(logging.Handler):
    def __init__(self, level=logging.INFO, tboard_dir='./runs') -> None:
        super().__init__(level)
        self.tblogger = SummaryWriter(tboard_dir)

    def emit(self, record: logging.LogRecord) -> None:
        # For debugging, print record.__dict__ to see how the record is structured
        # If the record contains TensorBoard data, add it to TB and flush
        if hasattr(record, 'args'):
            # TODO Do something with the arguments
            pass

    ...

class TensorBoardLogger(logging.Logger):
    def __init__(self, name: str='TensorBoardLogger', level=logging.INFO, tboard_dir='./runs') -> None:
        super().__init__(name, level)
        self.handler = TensorBoardLoggerHandler(level, tboard_dir)
        self.addHandler(self.handler)

    ...


logging.setLoggerClass(TensorBoardLogger)
logger = logging.getLogger('TensorBoardLogger')
logger.info('Some message', TensorBoardLogRecordData(None, None, 10000, None, None))

What I am trying to do (still a work in progress) is give the logger the ability to write a TensorBoard log entry (in my case via the PyTorch utilities module) that can be visualized with the tool in the web browser. Yours doesn't need to be that complicated. This "solution" is mostly for the case where you can't find a way to override the msg handling.

I also found this repository, visual logging, which uses Python's logging facilities to handle images. Following the code provided by the repo, I was able to get

<LogRecord: TensorBoardLogger, 20, D:\Projects\remote-sensing-pipeline\log.py, 86, "TensorBoardLogRecord(image=None, images=None, scalar=1, scalars=None, custom_scalars=None)">

{'name': 'TensorBoardLogger', 'msg': TensorBoardLogRecord(image=None, images=None, scalar=1, scalars=None, custom_scalars=None), 'args': (), 'levelname': 'INFO', 'levelno': 20, 'pathname': 'D:\\Projects\\remote-sensing-pipeline\\log.py', 'filename': 'log.py', 'module': 'log', 'exc_info': None, 'exc_text': None, 'stack_info': None, 'lineno': 86, 'funcName': '<module>', 'created': 1645193616.9026344, 'msecs': 902.6343822479248, 'relativeCreated': 834.2068195343018, 'thread': 6508, 'threadName': 'MainThread', 'processName': 'MainProcess', 'process': 16208}

by just calling

logger = TensorBoardLogger(tboard_dir='./LOG')
logger.info(TensorBoardLogRecord(image=None, images=None, scalar=1, scalars=None, custom_scalars=None))

where I changed TensorBoardLogRecord to be

TensorBoardLogRecord = namedtuple('TensorBoardLogRecord' , 'image images scalar scalars custom_scalars')

As you can see, msg is my TensorBoardLogRecord object, which confirms both my statement above and the statement in the documentation: as long as you customize your logging properly, you can log whatever you want. In the repo I pointed at, the author is logging images, which are numpy objects. Ultimately, however, those images are read from image files, so binary data is involved there as well.

answered Jun 02 '26 08:06 by rbaleksandar
