
How to dump a collection to json file using pymongo

I am trying to dump a collection to a .json file, but after looking through the pymongo tutorial I cannot find anything that relates to it.

Tutorial link: https://api.mongodb.com/python/current/tutorial.html

AnhNg asked Mar 07 '18 13:03


3 Answers

The accepted solution produces invalid JSON: it leaves a trailing comma , before the closing square bracket ]. The JSON spec does not allow trailing commas. See this answer and this reference.
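As a quick check of the trailing-comma point, here is a minimal sketch using only Python's standard json module. The sample strings and the helper name is_valid_json are made up for illustration and are not from the question.

```python
import json

# Hypothetical sample output: what a loop that writes a comma after every
# document would produce for two documents, and the same array without it.
with_trailing_comma = '[{"a": 1},{"a": 2},]'
without_trailing_comma = '[{"a": 1},{"a": 2}]'


def is_valid_json(text):
    """Return True if the standard-library parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(with_trailing_comma))
print(is_valid_json(without_trailing_comma))
```

The standard-library parser rejects the first string and accepts the second, so a file written with a trailing comma cannot be read back with json.load.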

To build on the accepted solution, I used the following:

from bson.json_util import dumps
from pymongo import MongoClient
import json

if __name__ == '__main__':
    client = MongoClient()
    db = client.db_name
    collection = db.collection_name
    cursor = collection.find({})
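    with open('collection.json', 'w') as file:
        # bson.json_util.dumps serializes the whole cursor (including BSON
        # types such as ObjectId) to a JSON array string; the round trip
        # through json.loads and json.dump then writes that array to the file.
        json.dump(json.loads(dumps(cursor)), file)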
garyj answered Oct 11 '22 09:10


Just fetch all the documents and save them to a file, e.g.:

from bson.json_util import dumps
from pymongo import MongoClient

if __name__ == '__main__':
    client = MongoClient()
    db = client.db_name
    collection = db.collection_name
    cursor = collection.find({})
    with open('collection.json', 'w') as file:
        file.write('[')
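        for document in cursor:
            file.write(dumps(document))
            # Note: a comma is written after every document, including the
            # last one, so the array ends with a trailing comma.
            file.write(',')
        file.write(']')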
kamillitw answered Oct 11 '22 11:10


Here's another way to avoid writing a , before the closing square bracket. It also uses with open to keep the code short.

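import json
from pymongo import MongoClient

# Assumed setup (not in the original snippet), using the same placeholder
# database name as the other answers.
client = MongoClient()
db = client.db_name

# Named query rather than filter to avoid shadowing the built-in filter().
query = {"type": "something"}
type_documents = db['cluster'].find(query)
type_documents_count = db['cluster'].count_documents(query)

with open("type_documents.json", "w") as file:
    file.write('[')
    # Start counting at 1 so that i equals type_documents_count on the last document.
    for i, document in enumerate(type_documents, 1):
        # default=str converts values json cannot encode (e.g. ObjectId) to strings.
        file.write(json.dumps(document, default=str))
        if i != type_documents_count:
            file.write(',')
    file.write(']')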

It skips the comma when the iteration index equals the number of documents, that is, on the last document it writes.
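A variation on the same idea, sketched under the assumption that you would rather not run a separate count_documents query: write the separator before every document except the first. The helper name write_json_array and the sample documents are hypothetical; with pymongo you would pass the cursor as documents and an open file as file.

```python
import io
import json


def write_json_array(documents, file):
    """Write an iterable of dicts to file as one JSON array, with no trailing comma."""
    file.write('[')
    first = True
    for document in documents:
        # The comma goes before each document except the first one,
        # so the total number of documents never needs to be known.
        if not first:
            file.write(',')
        # default=str turns values json cannot encode into strings.
        file.write(json.dumps(document, default=str))
        first = False
    file.write(']')


# Stand-in for a cursor (assumption): any iterable of dicts works here.
sample_documents = [
    {"_id": 1, "type": "something"},
    {"_id": 2, "type": "something"},
]

buffer = io.StringIO()
write_json_array(sample_documents, buffer)
result = buffer.getvalue()
print(json.loads(result))
```

Because the helper never looks at the document count, it also produces a valid empty array [] when the query matches nothing.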

robscott answered Oct 11 '22 11:10