
MongoDB C# driver IAsyncCursor<BsonDocument> behavior?

Tags:

c#

mongodb

I'm new to MongoDB (and its dotnet core C# driver), and I have the following question regarding the IAsyncCursor behavior:

From the official documentation: https://docs.mongodb.com/getting-started/csharp/query/, it seems that the recommended way of iterating through the IAsyncCursor is:

var collection = _database.GetCollection<BsonDocument>("restaurants");
var filter = new BsonDocument();
var count = 0;
using (var cursor = await collection.FindAsync(filter))
{
    while (await cursor.MoveNextAsync())
    {
        var batch = cursor.Current;
        foreach (var document in batch)
        {
            // process document
            count++;
        }
    }
}

However, to get the "current" batch of returned documents, the while loop calls "MoveNextAsync" first. Does "MoveNextAsync" skip the "current" batch? Or, logically, would the following modified code snippet make more sense?

var collection = _database.GetCollection<BsonDocument>("restaurants");
var filter = new BsonDocument();
var count = 0;
using (var cursor = await collection.FindAsync(filter))
{
    do
    {
        if (cursor.Current != null)
        {
            var batch = cursor.Current;
            foreach (var document in batch)
            {
                // process document
                count++;
            }
        }
    }
    while (await cursor.MoveNextAsync());
}

My understanding is that the cursor should start by pointing to the "current" batch already (if any), and I should first work on whatever the "current" batch is, and then move to the next batch of documents, if any.

But in all the sources I can find online, the iteration always calls "MoveNext" first and then works on the batch. This gives me the impression that the IAsyncCursor returned by FindAsync starts off positioned one step before the actual "current" (or first) batch of documents, and that "MoveNext" must be called first to move the cursor onto that batch.

From the coding point of view, calling "MoveNext" first makes the while loop more consistent, so my own code snippet doesn't have to (redundantly) check for the validity of the "current" inside the body of "do".

However, I do find that "IAsyncCursor.First()" does return the "first" document - I'm guessing now that the "First()" method actually does a "MoveNext" internally, and returns the first document of the "current" batch.

Also, since I'm using "FindAsync": if no document matches my filter, is the returned IAsyncCursor "null", or does "MoveNext" return false? Can I assume that the IAsyncCursor returned by FindAsync is always a valid object, so I don't have to check for null everywhere and only need to check the result of "MoveNext()" or "First()"?

Could any MongoDB experts share some insight on this?

Thanks!

Dejavu asked Oct 20 '25 15:10


1 Answer

The first code sample is correct and doesn't skip the first batch. However, you only need to directly use MoveNextAsync if you want explicit control of fetching batches.

Otherwise, it's simpler to use ForEachAsync, which wraps that complexity for you:

using (var cursor = await collection.FindAsync(filter))
{
    await cursor.ForEachAsync(document =>
    {
        // process document
        count++;
    });
}

See the ForEachAsync source here.

As shown in the source, ForEachAsync takes ownership of the cursor and disposes it for you, so you can also omit your own using block if you like.
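
To make the "MoveNextAsync first" pattern easier to reason about without a running server, here is a minimal sketch built on a hypothetical in-memory cursor. InMemoryBatchCursor, CursorDemo and ForEachAndDisposeAsync are names made up for this illustration and are not part of the MongoDB driver; the sketch only mirrors the MoveNextAsync()/Current shape and assumes the cursor starts positioned before its first batch, much like a plain IEnumerator.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for a batch cursor. It is NOT the driver's
// implementation; it only mirrors the MoveNextAsync()/Current shape.
public sealed class InMemoryBatchCursor<T> : IDisposable
{
    private readonly IReadOnlyList<IReadOnlyList<T>> _batches;
    private int _index = -1; // positioned before the first batch, like IEnumerator

    public InMemoryBatchCursor(IReadOnlyList<IReadOnlyList<T>> batches)
    {
        _batches = batches;
    }

    public bool IsDisposed { get; private set; }

    // Assumption of this sketch: empty until MoveNextAsync() has returned true.
    public IEnumerable<T> Current { get; private set; } = Array.Empty<T>();

    public Task<bool> MoveNextAsync()
    {
        _index++;
        var hasBatch = _index < _batches.Count;
        Current = hasBatch ? _batches[_index] : Array.Empty<T>();
        return Task.FromResult(hasBatch);
    }

    public void Dispose() => IsDisposed = true;
}

public static class CursorDemo
{
    // Same loop shape as the first sample: advance first, then read Current.
    public static async Task<int> CountWithLoopAsync<T>(InMemoryBatchCursor<T> cursor)
    {
        var count = 0;
        using (cursor)
        {
            while (await cursor.MoveNextAsync())
            {
                foreach (var item in cursor.Current)
                {
                    count++;
                }
            }
        }
        return count;
    }

    // Hypothetical helper in the spirit of ForEachAsync: it wraps the same
    // loop and disposes the cursor it was given.
    public static async Task ForEachAndDisposeAsync<T>(
        this InMemoryBatchCursor<T> source, Action<T> processor)
    {
        using (source)
        {
            while (await source.MoveNextAsync())
            {
                foreach (var item in source.Current)
                {
                    processor(item);
                }
            }
        }
    }

    public static async Task Main()
    {
        var batches = new List<IReadOnlyList<int>> { new[] { 1, 2, 3 }, new[] { 4, 5 } };

        // No batch is skipped: all five items are visited.
        Console.WriteLine(await CountWithLoopAsync(new InMemoryBatchCursor<int>(batches))); // 5

        // No matches: the cursor object still exists, the loop body never runs.
        var empty = new InMemoryBatchCursor<int>(new List<IReadOnlyList<int>>());
        Console.WriteLine(await CountWithLoopAsync(empty)); // 0

        var cursor = new InMemoryBatchCursor<int>(batches);
        var sum = 0;
        await cursor.ForEachAndDisposeAsync(item => sum += item);
        Console.WriteLine($"{sum} {cursor.IsDisposed}"); // 15 True
    }
}
```

In this sketch an empty result is just a cursor with no batches, so checking the return value of MoveNextAsync is enough and no null check on the cursor is needed. How the real driver reports an empty result (for example, whether it first yields an empty batch) is not modelled here, but the while loop handles either case the same way.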

JohnnyHK answered Oct 22 '25 05:10