From what I have found, the journal files created by MongoDB are compressed with the Snappy compression algorithm, but I am not able to decompress one of them. Trying to decompress it gives this error:
Error stream missing snappy identifier
The Python code I used to decompress it is as follows:
import collections
import bson
from bson.codec_options import CodecOptions
import snappy
from cStringIO import StringIO
try:
    with open('journal/WiredTigerLog.0000000011') as f:
        content = f.readlines()
        fh = StringIO()
        snappy.stream_decompress(StringIO("".join(content)),fh)
        print fh
except Exception,e:
    print str(e)
    pass
Please help; I can't make any progress past this point.
With journaling enabled, MongoDB writes the in-memory changes first to on-disk journal files. If MongoDB should terminate or encounter an error before committing the changes to the data files, MongoDB can use the journal files to apply the write operation to the data files and maintain a consistent state.
The long answer: No, deleting the journal file isn't safe. The idea of journalling is this: A write comes in. Now, to make that write persistent (and the database durable), the write must somehow go to the disk.
For 64-bit builds of mongod, journaling is enabled by default. To enable journaling, start mongod with the --journal command line option.
WiredTiger syncs the buffered journal records to disk every 100 milliseconds (see storage.journal.commitIntervalMs) and when WiredTiger creates a new journal file.
There are two forms of Snappy compression: the basic form and the streaming form. The basic form has the limitation that all of the data must fit in memory, so the streaming form exists to compress larger amounts of data. The streaming format has a header followed by subranges that are compressed. If the header is missing, it sounds like the data may have been compressed using the basic form and you are trying to uncompress it with the streaming form. https://github.com/andrix/python-snappy/issues/40
If that is the case, use decompress instead of stream_decompress.
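The "missing snappy identifier" error refers to the header that the streaming (framing) format expects at the very start of the data. As a rough check before picking a function, here is a minimal sketch that tests whether a file begins with that header. It assumes the standard Snappy framing format; the helper names has_snappy_stream_identifier and read_prefix are made up for this example and are not part of python-snappy.

```python
# Stream identifier chunk from the Snappy framing format:
# chunk type 0xff, a 3-byte little-endian length (6), then the ASCII text "sNaPpY".
SNAPPY_STREAM_IDENTIFIER = b"\xff\x06\x00\x00sNaPpY"


def has_snappy_stream_identifier(data):
    """Return True if data starts with the Snappy framing-format header."""
    # Hypothetical helper, not a python-snappy function.
    return data[:len(SNAPPY_STREAM_IDENTIFIER)] == SNAPPY_STREAM_IDENTIFIER


def read_prefix(path, size=len(SNAPPY_STREAM_IDENTIFIER)):
    """Read the first few bytes of a file in binary mode."""
    with open(path, "rb") as f:
        return f.read(size)
```

If has_snappy_stream_identifier(read_prefix('journal/WiredTigerLog.0000000011')) is False, the file is not a Snappy stream as a whole, so stream_decompress would be expected to reject it.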
But it could be that the data isn't compressed at all:
with open('journal/WiredTigerLog.0000000011') as f:
    for line in f:
        print line
could work.
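Since the journal file is binary, printing it line by line can be hard to read. A hex dump of the first bytes may make it easier to see whether the content looks like plain data or a compressed stream. This is only a sketch: hex_preview is a hypothetical helper, and the 64-byte preview size is an arbitrary choice.

```python
import binascii


def hex_preview(path, size=64, width=16):
    """Return hex-dump lines for the first `size` bytes of a file."""
    with open(path, "rb") as f:  # binary mode: no newline or encoding translation
        data = f.read(size)
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = binascii.hexlify(chunk).decode("ascii")
        # Show printable ASCII as-is and every other byte as "."
        text_part = "".join(
            chr(b) if 32 <= b < 127 else "." for b in bytearray(chunk)
        )
        lines.append("%08x  %s  %s" % (offset, hex_part, text_part))
    return lines
```

For example, printing each line of hex_preview('journal/WiredTigerLog.0000000011') shows the offset, the raw bytes in hex, and any readable ASCII side by side.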
Minimum log record size for WiredTiger is 128 bytes. If a log record is 128 bytes or smaller, WiredTiger does not compress that record. https://docs.mongodb.com/manual/core/journaling/