I'm developing a real-time REST API using Python 3 + Bottle/uWSGI. I was experiencing latencies in my code, sometimes hundreds of milliseconds, which matters in my app.
Using the logging module, I tried to identify the slow parts of my code by printing how long individual code blocks took to run. I know this is an extremely poor way of profiling code, but sometimes it does the job quite well.
Even though I identified some slow parts, I was still missing something: the individual parts seemed to take tens of milliseconds, yet as a whole they very often took hundreds. After some increasingly insane experiments which drove me almost completely nuts, I've come to the following:
```python
t = round(100 * time.time())
logging.info('[%s] Foo' % t)
logging.info('[%s] Bar' % t)
```
Surprisingly, it gives:
```
2014-07-16 23:21:23,531 [140554568353] Foo
2014-07-16 23:21:24,312 [140554568353] Bar
```
Even though this seems hard to believe, these are two consecutive logging.info() calls, and for some reason there is a gap of almost 800 ms between them. Can anybody tell me what is going on? It is noteworthy that if there are multiple info() calls, the latency appears only once in the whole API method, most frequently at its very beginning (after the first call). My only hypothesis is disk latency; there are several (but not that many!) workers running simultaneously and I'm using a rotational disk, not an SSD. But I thought there are buffers and that the OS would do the disk flush asynchronously for me. Am I wrong in my assumptions? Should I avoid logging completely to avoid the latencies?
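The blocking behavior described above is easy to reproduce with a deliberately slow handler (the handler below is synthetic, not the one from my setup): logging.info() does not return until every attached handler's emit() has finished, so any I/O stall inside a handler shows up as latency in the calling thread.

```python
import logging
import time

class SlowHandler(logging.Handler):
    """Simulates a handler whose I/O stalls for 100 ms."""
    def emit(self, record):
        time.sleep(0.1)  # stand-in for a blocked disk write

logger = logging.getLogger("demo")
logger.addHandler(SlowHandler())
logger.setLevel(logging.INFO)

start = time.time()
logger.info("Foo")          # blocks until SlowHandler.emit() returns
elapsed = time.time() - start
print("logger.info() took %.0f ms" % (elapsed * 1000))  # roughly 100 ms
```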
EDIT
Based on Vinay Sajip's suggestion, I switched to the following initialization code:
```python
log_que = queue.Queue(-1)
queue_handler = logging.handlers.QueueHandler(log_que)
log_handler = logging.StreamHandler()
queue_listener = logging.handlers.QueueListener(log_que, log_handler)
queue_listener.start()
logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(message)s",
                    handlers=[queue_handler])
```
It seems to work fine (if I comment out queue_listener.start(), there is no output), but the very same latencies still appear. I don't see how that is even possible; the call should be non-blocking. I also put a gc.collect() at the end of each request to make sure the problem is not caused by the garbage collector, without any effect. I've also tried turning off logging for the whole day. The latencies disappeared, so I think their source must be in the logging module...
The built-in Python logger is I/O-blocking, which means that using the built-in logging module will interfere with the performance of an asynchronous application. aiologger aims to be the standard asynchronous, non-blocking logging library for Python and asyncio.
It will slow down your application (obviously), but whether that slowdown qualifies as "serious" depends a lot on the application. I think you need to let it run and then decide whether the performance is acceptable...
The multiprocessing module has its own logger, named "multiprocessing". This logger is used by objects and functions within the multiprocessing module to log messages, such as debug messages that processes are running or have shut down. We can get this logger and use it for our own logging.
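A minimal sketch of getting that logger via the standard multiprocessing.get_logger() call; by default it has no handlers attached, so one is added here for visible output:

```python
import logging
import multiprocessing

# Obtain the multiprocessing module's own logger (named "multiprocessing").
logger = multiprocessing.get_logger()
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())  # attach a handler to see output

logger.info("message via the multiprocessing logger")
print(logger.name)
```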
You can use asynchronous handlers (QueueHandler and the corresponding QueueListener, added in Python 3.2 and described in this post) and do the I/O processing of your logging events in a separate thread or process.
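A minimal end-to-end sketch of that approach, including the shutdown step the question's snippet omits (the logger name here is illustrative):

```python
import logging
import logging.handlers
import queue

# The caller only pays for an in-memory queue put; all handler I/O
# happens in the QueueListener's background thread.
log_queue = queue.Queue(-1)

root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.handlers.QueueHandler(log_queue))

listener = logging.handlers.QueueListener(log_queue, logging.StreamHandler())
listener.start()

root.info("handled off the calling thread")

listener.stop()  # drain the queue and join the worker thread on shutdown
```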
As hasan has proposed, an async log handler can be the way to go. I have recently tried Logbook and can say it provides everything you need for this: a ZeroMQHandler as well as a ZeroMQSubscriber.