I am building a service where I log plain-text logs from several sources (one file per source). I do not intend to rotate these logs, as they must stay around forever.
To keep these long-lived files smaller, I hope to gzip them on the fly. Since they are log data, the files compress very well.
What is a good approach in Python to write append-only gzipped text files, so that writing can be resumed when the service goes off and comes back on? I am not that worried about losing a few lines, but if the gzip container itself breaks and the file becomes unreadable, that's a no-no.
Also, if it's a no-go, I can simply write them as plain text without gzipping, if it's not worth the hassle.
To compress an existing file into a gzip archive, read the text from it and convert it to a bytearray. This bytearray is then written to a gzip file. In the example below, a 'zen.txt' file is assumed to be present in the current directory.
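A minimal sketch of that approach (the output name zen.txt.gz is an assumption):

import gzip

# Read the existing plain-text file as bytes and wrap it in a bytearray.
with open("zen.txt", "rb") as src:
    data = bytearray(src.read())

# Write the bytes into a gzip archive; zen.txt.gz is an assumed output name.
with gzip.open("zen.txt.gz", "wb") as dst:
    dst.write(data)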
Note: on Unix systems you should seriously consider using an external program written for this exact task:
logrotate
(rotates, compresses, and mails system logs). You can set the number of rotations so high that the first file would only be deleted in 100 years or so.
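For illustration, a configuration along these lines (the log path and schedule are assumptions, not taken from the question) would keep roughly a century of weekly, compressed rotations:

/var/log/myservice/*.log {
    weekly
    rotate 5200        # ~100 years of weekly rotations before anything is deleted
    compress           # gzip rotated files
    delaycompress      # keep the most recent rotation uncompressed for easy tailing
    missingok
    notifempty
}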
In Python 2, logging.FileHandler takes a keyword argument encoding that can be set to bz2 or zlib.
This is because logging uses the codecs module, which in turn treats bz2 (or zlib) as an encoding:
>>> import codecs
>>> with codecs.open("on-the-fly-compressed.txt.bz2", "w", "bz2") as fh:
... fh.write("Hello World\n")
$ bzcat on-the-fly-compressed.txt.bz2
Hello World
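Wired into the logging module, that might look roughly like this under Python 2 (the file and logger names are made up for illustration):

import logging

# FileHandler opens the file through codecs.open() when an encoding is
# given, so every record is bz2-compressed as it is written.
handler = logging.FileHandler("service.log.bz2", mode="a", encoding="bz2")
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))

logger = logging.getLogger("my_service")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Hello World")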
Python 3 version (although the docs mention bz2 as an alias, you'll actually have to use bz2_codec - at least with 3.2.3):
>>> import codecs
>>> with codecs.open("on-the-fly-compressed.txt.bz2", "w", "bz2_codec") as fh:
... fh.write(b"Hello World\n")
$ bzcat on-the-fly-compressed.txt.bz2
Hello World
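Since the question asks about gzip specifically, here is a minimal Python 3 sketch (file name assumed) that appends in text mode; each reopen starts a new gzip member, and concatenated members remain a valid, readable gzip stream even if the service stops between writes:

import gzip

# Append a line to a gzip log file; "at" opens the file in text append
# mode, so each run adds a new gzip member to the existing archive.
def append_line(path, line):
    with gzip.open(path, "at", encoding="utf-8") as fh:
        fh.write(line + "\n")

append_line("source1.log.gz", "service started")
append_line("source1.log.gz", "another event")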