Using the exact same code, every call to a logger method (such as logger.debug) takes about 50 ms on the server, while on the dev machine it takes less than 1 ms. The loggers are outputting to files, with a little bit of formatting.
Apart from the logging slowdown, the server is twice as fast as the dev machine.
I'm developing on Ubuntu 11.04 (Gnome) running inside VMWare on Windows 7. The server is running Ubuntu Server 11.04 (no GUI, pure console). The logging module is the official "logging" module ("import logging ... logger = logging.getLogger('mylogger')").
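For reference, the setup looks roughly like this (a sketch from memory; the file path and format string here are placeholders, not my real ones):

import logging

logger = logging.getLogger('mylogger')
logger.setLevel(logging.DEBUG)

# Placeholder path and format; the real ones differ.
handler = logging.FileHandler('app.log')
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logger.addHandler(handler)

logger.debug('this kind of call takes ~50ms on the server')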
Any idea what could be causing this? It is extremely frustrating!
Thanks for any help!
EDIT: Both machines return "Python 2.7.1+" for their version. Both machines are running 64-bit Ubuntu.
As for hard drive configuration: the server uses software RAID-1, while the dev computer has just a single drive.
EDIT2: Accepted Fabian's answer as it was thorough, although it didn't quite solve the problem.
Solution: Writing to the console is extremely slow, period. I tested writing the same output to a file and to the console, and the console was about 100x slower. I don't know why that would be, but running the same program over ssh from another computer solved the problem.
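The test I ran was roughly the following (a sketch from memory; the iteration count and file name are arbitrary). Timings are written to stderr so they don't pollute the stream being measured:

import sys
import time

def bench(stream, n=100000):
    # Time n short writes to the given stream.
    start = time.time()
    for i in range(n):
        stream.write("iteration: %d\n" % i)
    stream.flush()
    return time.time() - start

out = open("bench.out", "w")
sys.stderr.write("file:    %.3fs\n" % bench(out))
out.close()
sys.stderr.write("console: %.3fs\n" % bench(sys.stdout))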
As noted in the comments, a possible reason is disk speed differences between the development VM and the production machine. Do you have the same kind of drives in both systems: SSD, SATA vs. SCSI, spindle speed, cache size, and so on?

Your environments are also quite different in terms of IO behaviour. Desktop Windows and VMWare will use aggressive disk caches, while your production Linux machine is likely to err on the safe side and wait for data to be committed to disk more frequently. Maybe the Windows machine has drivers that are better suited to the kind of disk it has, while the server is running without such optimisations. The file systems differ as well, and the hardware is probably different enough to cause significant differences in IO speed.

You could also have large differences in CPU and RAM speed: desktop machines nowadays are often focused on raw speed, while server hardware will focus more on reliability. You know your setup best, so you can probably compare the two systems in terms of hardware performance.
Beyond that, here's how you can find out what's really going on:
First, write a minimal working example (MWE) to test logging. You should base it on your real code and use logging in a similar way, but here's a small example:
import logging
logging.basicConfig(filename="test.log", level=logging.DEBUG)
logger = logging.getLogger("testlogger")
for i in range(0, 1000000):
    logger.info("iteration: %d", i)
Then run the script under cProfile on both your development and production machines. Be sure to log to the same filesystem as in your problem case; otherwise the results won't be applicable.
python -m cProfile testlogging.py
You will get output that looks like this:
57000501 function calls in 137.072 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 UserDict.py:4(__init__)
1 0.000 0.000 0.000 0.000 __init__.py:1044(_fixupParents)
1 0.000 0.000 0.000 0.000 __init__.py:1085(Logger)
2 0.000 0.000 0.000 0.000 __init__.py:1100(__init__)
1 0.000 0.000 0.000 0.000 __init__.py:1112(setLevel)
..... ..... ..... ..... ..... ...............................
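If the reports are long, it can help to dump each profile to a file with cProfile's -o option and load it with the standard pstats module, so you can sort the entries and compare the two runs side by side (a sketch; dev.prof is just an example name):

python -m cProfile -o dev.prof testlogging.py

import pstats

# Load the saved profile and print the ten most expensive entries,
# sorted by cumulative time; do this for the dump from each machine.
stats = pstats.Stats("dev.prof")
stats.sort_stats("cumulative").print_stats(10)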
This should give you an idea of what is causing the slowness on the production machine. Things to look for in particular:
- A large amount of time spent in {method 'write' of 'file' objects} and {method 'flush' of 'file' objects}. This will tell you how much time Python has spent writing to files and flushing the data to disk – in this case, the log file. Are there significant differences between the two machines? If so, it's definitely a difference in IO (disk) speed. You should then take a look at the disk setup of the server and see if there is anything you can do to get better disk performance.
- Functions whose percall column is particularly large. This column is the total time spent in the function divided by the number of calls to that function. Compare between the two machines, and you might find what's causing the difference.
- Functions whose tottime column is particularly large. This column is the total time spent in the function. Again, compare between the two machines, and you may find some reasons for the speed difference.

If you find that disk IO seems to be the issue, you can do an additional test with just raw writing to files. You can probably find a benchmarking program that allows you to test disk throughput, but you could also just write a simple C (or Python) program that writes unformatted data to a file, to make sure it really is pure disk performance that makes the difference.
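Such a raw-write test in Python could look something like this (a sketch; block size, count, and file name are arbitrary):

import os
import time

block = "x" * 4096            # one 4 KiB block of unformatted data
count = 25600                 # 25600 * 4 KiB = 100 MiB in total

start = time.time()
f = open("rawtest.bin", "wb")
for _ in range(count):
    f.write(block)
f.flush()
os.fsync(f.fileno())          # force the data out of the OS cache onto the disk
f.close()

print("wrote 100 MiB in %.2fs" % (time.time() - start))

Running this on both machines takes the logging module, formatting, and Python function-call overhead out of the picture, leaving only the filesystem and the disk.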
Just one last note: performance testing is, like programming, a mix of art, science, and engineering, and while there are patterns and advice you can follow, every case needs a bit of inventiveness to crack. So try things, make sure you don't fool yourself, and have fun! Good luck!