 

Read mongodump manually

I have a large dump file created by the mongodump utility, for example "test.dump". I want to extract one specific collection from this dump and read it into memory manually, as valid BSON documents, for further processing. I cannot load the full dump into memory because of its size.

I do not need to physically restore anything to a MongoDB instance! In fact, I do not have any instances up and running. So the mongorestore utility would be a solution only if it can help me read my collection from the dump file into memory.

I'm using Python 3 and pymongo, but I can import other third-party libraries if necessary, or launch any CLI utilities and read their stdout.

Nikolay Prokopyev asked Mar 09 '26 04:03



1 Answer

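By default, mongodump writes one .bson file per collection (for example dump/test/hello.bson), and each file is simply a sequence of BSON documents stored back to back. So you can open the file for the collection you need and decode it one document at a time, with no running MongoDB instance required. If the dump was created with --gzip, the files are gzip-compressed and have a .bson.gz extension: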

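import gzip

import bson  # the bson package ships with PyMongo

with gzip.open('dump/test/hello.bson.gz', mode='rb') as f:
    # decode_file_iter yields one document at a time,
    # so the whole file is never held in memory at once
    for doc in bson.decode_file_iter(f):
        print(doc)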

Documentation of bson.decode_file_iter(): https://pymongo.readthedocs.io/en/stable/api/bson/index.html#bson.decode_file_iter
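
To illustrate why streaming works here, below is a minimal standard-library sketch of how such a file can be split into documents. It relies on the BSON rule that every document starts with a 4-byte little-endian int32 holding its total size, including those 4 bytes. The function name `iter_raw_bson_documents` is my own invention for this example and is not part of PyMongo; in practice `bson.decode_file_iter()` already does this and decodes the documents for you.

```python
import io
import struct


def iter_raw_bson_documents(stream):
    """Yield each BSON document in a binary stream as raw bytes.

    Hypothetical helper for illustration only: it splits the stream on
    the BSON length prefix and does not decode the documents.
    """
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) < 4:
            raise ValueError("truncated BSON length prefix")
        # The int32 size counts the whole document, prefix included.
        (size,) = struct.unpack("<i", header)
        if size < 5:  # an empty document is 4 size bytes + 1 terminator
            raise ValueError(f"invalid BSON document size: {size}")
        body = stream.read(size - 4)
        if len(body) != size - 4:
            raise ValueError("truncated BSON document")
        yield header + body


# Two hand-built documents: {} followed by {"a": 1} (int32 value).
sample = b"\x05\x00\x00\x00\x00" + b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"
sizes = [len(raw) for raw in iter_raw_bson_documents(io.BytesIO(sample))]
print(sizes)
```

Each yielded `bytes` object should be a complete document, so it could then be passed to `bson.decode()` from PyMongo. Only one document is held in memory at a time, which is the property the question asks for.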

Messa answered Mar 10 '26 17:03



