
Mongoose limiting query to 1000 results when I want more/all (migrating from 2.6.5 to 3.1.2)

I'm migrating my app from Mongoose 2.6.5 to 3.1.2, and I'm running into some unexpected behavior. Namely, query results are automatically being limited to 1000 records, while pretty much everything else works the same. In my code (below) I use a value, maxIvDataPoints, that limits the number of data points returned (and ultimately sent to the client browser); it is set elsewhere to 1500. I run a count query to determine the total number of potential results, then use the count and maxIvDataPoints to compute a modulus, and apply that modulus in a mod clause on the subsequent query to limit the actual results. I'm running Node 0.8.4 and MongoDB 2.0.4, and writing server-side code in CoffeeScript.
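For illustration only, here is the sampling arithmetic on its own, written as plain JavaScript with no database involved. The helper names samplingModulus and sampledCount are hypothetical, and the sketch assumes dataPoint values are numbered sequentially from 1, which may not match the real collection.

```javascript
// Hypothetical helper: the modulus used to thin the result set.
// Returns null when the full set already fits under maxPoints.
function samplingModulus(count, maxPoints) {
  if (count <= maxPoints) return null;
  return Math.ceil(count / maxPoints);
}

// Hypothetical helper: how many points numbered 1..count survive
// the filter dataPoint % mod === 1 (assumes sequential numbering from 1).
function sampledCount(count, mod) {
  if (mod === null) return count;
  let kept = 0;
  for (let dataPoint = 1; dataPoint <= count; dataPoint++) {
    if (dataPoint % mod === 1) kept++;
  }
  return kept;
}

const mod = samplingModulus(4000, 1500); // 3
const kept = sampledCount(4000, mod);    // 1334
console.log(mod, kept);
```

With 4000 points in range and a cap of 1500, every third point is kept, which gives 1334 points: under the cap, but well over 1000.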

Prior to installing mongoose 3.1.x the code was working as I had wanted, returning just under 1500 data points each time. After installing 3.1.2 I'm getting exactly 1000 data points returned each time (assuming there are more than 1000 data points in the specified range). The results are truncated, so that data points 1001 to ~1500 are the ones no longer being returned.

It seems there may be some setting somewhere that governs this behavior, but I can't find anything in the docs, on here, or in the Google group. I'm still a relative n00b so I may have missed something obvious.

DataManager::ivDataQueryStream = (testId, minTime, maxTime, callback) ->

    # If minTime and maxTime have both been provided, set a flag to limit time extents of query
    unless isNaN(minTime) or isNaN(maxTime)
        timeLimits = true

    # Load the max number of IV data points to be displayed from CONFIG
    maxIvDataPoints = CONFIG.maxIvDataPoints

    # Construct a count query to determine the number of IV data points in range
    ivCountQuery = TestDataPoint.count({})
    ivCountQuery.where "testId", testId

    if timeLimits
        ivCountQuery.gt "testTime", minTime
        ivCountQuery.lt "testTime", maxTime

    ivCountQuery.exec (err, count) ->

        ivDisplayQuery = TestDataPoint.find({})
        ivDisplayQuery.where "testId", testId

        if timeLimits
            ivDisplayQuery.gt "testTime", minTime
            ivDisplayQuery.lt "testTime", maxTime

        # If the data set is too large, use modulo to sample, keeping the total data series
        # for display below maxIvDataPoints
        if count > maxIvDataPoints
            dataMod = Math.ceil count/maxIvDataPoints

            ivDisplayQuery.mod "dataPoint", dataMod, 1

        ivDisplayQuery.sort "dataPoint" #, 1 <-- new sort syntax for Mongoose 3.x
        callback ivDisplayQuery.stream()
Eli asked Sep 28 '12 15:09



2 Answers

You're getting tripped up by a pair of related factors:

  1. Mongoose's default query batchSize changed to 1000 in 3.1.2.
  2. MongoDB has a known issue where a query that requires an in-memory sort returns no more documents than the query's batch size, so the batch size acts as a hard limit on the result count.

So your options are either to add a compound index on TestDataPoint that lets MongoDB use the index to sort by dataPoint in this type of query, or to increase the batch size to at least the total number of documents you're expecting.
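A rough sketch of both options, in plain JavaScript. The wrapper functions addSortIndex and raiseBatchSize are hypothetical names; the calls inside them, Schema#index and Query#batchSize, are existing Mongoose methods. The index fields are an assumption based on the query in the question (equality on testId, sort on dataPoint), and whether MongoDB actually uses the index for the sort is worth confirming with explain() on your server version.

```javascript
// Option 1: declare a compound index so MongoDB can return documents
// already ordered by dataPoint for a given testId instead of sorting in memory.
// `schema` is assumed to be the Mongoose Schema behind TestDataPoint.
function addSortIndex(schema) {
  schema.index({ testId: 1, dataPoint: 1 });
  return schema;
}

// Option 2: raise the batch size to at least the number of documents expected.
// `query` is assumed to be a Mongoose Query such as ivDisplayQuery.
function raiseBatchSize(query, expectedCount) {
  return query.batchSize(expectedCount);
}
```

In the question's code, option 2 would amount to calling ivDisplayQuery.batchSize with maxIvDataPoints before calling stream(), since the modulo sampling keeps the result count below that value.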

JohnnyHK answered Sep 26 '22 09:09



Wow, that's awful. I'll publish a fix to Mongoose soon that removes the batchSize default (it was helpful when streaming large result sets). Thanks for the pointer.

UPDATE: 3.2.1 and 2.9.1 have been released with the fix (removed batchSize default).

aaronheckmann answered Sep 25 '22 09:09
