I have around 2 million strings of varying lengths that I need to compress and store in MongoDB GridFS as files.
The strings are currently stored in a TEXT column of an MS SQL Server table. I wrote a sample app that reads each row, compresses it, and stores it as a GridFS file.
There is one reader and a thread pool of 50 threads storing the results. It works, but it is very slow (about 100 records per second on average).
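For illustration, the single-reader/worker-pool pattern described above can be sketched as follows. This is a minimal stand-in, not the actual app: the row source is faked, `store_as_gridfs_file` is a hypothetical placeholder for the GridFS upload, and the worker count is reduced from 50.

```python
import queue
import threading
import zlib

NUM_WORKERS = 4  # the original app used 50

stored = {}  # stand-in for GridFS; maps filename -> compressed bytes

def store_as_gridfs_file(name, data):
    # Hypothetical placeholder: a real app would write `data`
    # to GridFS here instead of a dict.
    stored[name] = data

# Bounded queue so the reader cannot race far ahead of the workers.
work = queue.Queue(maxsize=1000)

def worker():
    while True:
        item = work.get()
        if item is None:  # sentinel: no more rows
            work.task_done()
            break
        name, text = item
        store_as_gridfs_file(name, zlib.compress(text.encode("utf-8")))
        work.task_done()

threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()

# The single reader: a fake result set instead of the SQL table.
for i in range(100):
    work.put((f"doc-{i}", f"row {i} text" * 10))

# One sentinel per worker, then wait for everything to drain.
for _ in threads:
    work.put(None)
for t in threads:
    t.join()
```

With a pattern like this, throughput is usually limited by whatever `store_as_gridfs_file` does per item, which is why profiling the store path (as in the answer below) pays off.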
I was wondering if there is any way for faster import into GridFS?
I'm using MongoDB 1.6 on Windows with the MongoCSharp driver in C# and .NET.
I think I found the issue in the MongoDB C# driver by profiling it while running a very simple app that stores 1,000 strings as 1,000 GridFS files.
It turns out that 97% of the time was spent checking whether a file with the same filename already exists in the collection. I added an index on the filename field and it's now blazing fast!
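For anyone hitting the same wall, the index can be created from the mongo shell (assuming the default GridFS root collection `fs`; this runs against a live server, so it is shown here only as the command, not as runnable sample code):

```javascript
db.fs.files.ensureIndex({ filename: 1 });
```

Without this index, every filename-existence check is a full collection scan of `fs.files`, which degrades as the import grows.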
My remaining question: if the driver needs to keep filenames unique and performs that check, why doesn't it create a unique index on the field when one is missing? What's the reasoning behind that?