I have a large dump file created by the mongodump utility, for example "test.dump". I want to extract one specific collection from this dump and manually read it into memory for further processing as valid BSON documents. I cannot load the full dump into memory because of its size.
I do not need to physically restore anything to a mongo instance! I don't even have one up and running. So the mongorestore utility could be a solution only if it can help me read my collection from a dump file into memory.
I'm using Python 3 and pymongo, but I can import other third-party libraries if necessary, or launch any CLI utility and capture its stdout.
The files mongodump produces are just the BSON documents of a collection, concatenated back to back.
import gzip
import bson  # the bson package ships with the pymongo library

with gzip.open('dump/test/hello.bson.gz', mode='rb') as f:
    for doc in bson.decode_file_iter(f):
        print(doc)
Documentation of bson.decode_file_iter(): https://pymongo.readthedocs.io/en/stable/api/bson/index.html#bson.decode_file_iter