I have some trouble importing a JSON file to a local MongoDB instance. The JSON was generated using mongoexport
and looks like this. No arrays, no hardcore nesting:
{"_created":{"$date":"2015-10-20T12:46:25.000Z"},"_etag":"7fab35685eea8d8097656092961d3a9cfe46ffbc","_id":{"$oid":"562637a14e0c9836e0821a5e"},"_updated":{"$date":"2015-10-20T12:46:25.000Z"},"body":"base64 encoded string","sender":"[email protected]","type":"answer"}
{"_created":{"$date":"2015-10-20T12:46:25.000Z"},"_etag":"7fab35685eea8d8097656092961d3a9cfe46ffbc","_id":{"$oid":"562637a14e0c9836e0821a5e"},"_updated":{"$date":"2015-10-20T12:46:25.000Z"},"body":"base64 encoded string","sender":"[email protected]","type":"answer"}
If I import a 9MB file with ~300 rows, there is no problem:
[stekhn latest]$ mongoimport -d mietscraping -c mails mails-small.json
2015-11-02T10:03:11.353+0100 connected to: localhost
2015-11-02T10:03:11.372+0100 imported 240 documents
But if try to import a 32MB file with ~1300 rows, the import fails:
[stekhn latest]$ mongoimport -d mietscraping -c mails mails.json
2015-11-02T10:05:25.228+0100 connected to: localhost
2015-11-02T10:05:25.735+0100 error inserting documents: lost connection to server
2015-11-02T10:05:25.735+0100 Failed: lost connection to server
2015-11-02T10:05:25.735+0100 imported 0 documents
Here is the log:
2015-11-02T11:53:04.146+0100 I NETWORK [initandlisten] connection accepted from 127.0.0.1:45237 #21 (6 connections now open)
2015-11-02T11:53:04.532+0100 I - [conn21] Assertion: 10334:BSONObj size: 23592351 (0x167FD9F) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "mails"
2015-11-02T11:53:04.536+0100 I NETWORK [conn21] AssertionException handling request, closing client connection: 10334 BSONObj size: 23592351 (0x167FD9F) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "mails"
I've heard about the 16MB limit for BSON documents before, but since no row in my JSON file is bigger than 16MB, this shouldn't be a problem, right? When I do the exact same (32MB) import one my local computer, everything works fine.
Any ideas what could cause this weird behaviour?
You can use the --collection (or -c ) parameter to specify a collection to import the file into.
Default behavior says skip if already exists so by default it wont overwrite existing data. But you can update it using --upsert flag.
The mongoimport tool imports content from an Extended JSON, CSV, or TSV export created by mongoexport , or potentially, another third-party export tool. Run mongoimport from the system command line, not the mongo shell.
I guess the problem is about performance, any way you can solved used:
you can use mongoimport option -j. Try increment if not work with 4. i.e, 4,8,16, depend of the number of core you have in your cpu.
mongoimport --help
-j, --numInsertionWorkers= number of insert operations to run concurrently (defaults to 1)
mongoimport -d mietscraping -c mails -j 4 < mails.json
or you can split the file and import all files.
I hope this help you.
looking a little more, is a bug in some version https://jira.mongodb.org/browse/TOOLS-939 here another solution you can change the batchSize, for default is 10000, reduce the value and test:
mongoimport -d mietscraping -c mails < mails.json --batchSize 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With