
MongoDB query for array size > N fails, even with "$exists" check

Tags: mongodb

I'm trying to write a query on the sample_mflix.movies collection to find all documents where the genres array has more than 3 elements. My attempts are failing unexpectedly.

My Goal: Find all movies with more than 3 genres (that is, 4 or more).

Document Structure: A document that should match has a genres array like this:

{
  "title": "Example Movie with 4 Genres",
  "genres": ["Action", "Adventure", "Sci-Fi", "Thriller"]
}

A document that should not match might have fewer genres or no genres field at all:

{
  "title": "Example Movie with 2 Genres",
  "genres": ["Comedy", "Romance"]
}

{
  "title": "Example Movie with No Genres"
}

What I've Tried:

Attempt 1: Using $expr and $size

My first query was:

db.movies.find({
  $expr: { $gt: [ { $size: "$genres" }, 3 ] }
})

This query fails with the error: MongoServerError: The argument to $size must be an array, but was of type: missing. This makes sense, as some documents do not have the genres field.

Attempt 2: The recommended dot notation approach

To solve the previous error, I used the standard dot notation to check for the existence of the 4th element (index 3). This should be the most performant and correct way.

db.movies.find({ "genres.3": { "$exists": true } })

In the MongoDB Compass filter field, I am entering

{ "genres.3": { "$exists": true } }

Unexpected Result: This query returns no documents.

Attempt 3: Using $expr with an $isArray check

I also tried a more robust version of the first query to explicitly handle missing or non-array fields:

db.movies.find({
  $expr: {
    $and: [
      { $isArray: "$genres" },
      { $gt: [ { $size: "$genres" }, 3 ] }
    ]
  }
})

Unexpected Result: This query fails with the same error as Attempt 1.

My Environment:

MongoDB Server Version: 8.0.13

MongoDB Compass Version: 1.46.9

Database Hosting: MongoDB Atlas

My Question: Why are the standard solutions (especially the "field.index": { "$exists": true } check) failing to work in my environment? Is there a subtle issue with the data structure I might be missing, or could this be an environment-specific problem? What is the correct way to formulate this query?

Thank you for your help.

asked Oct 24 '25 15:10 by Andrea Giovene


2 Answers

I think aneroid's answer already addresses your concerns well. I am just providing another canonical way of handling a potentially missing field.

I suggest using $ifNull to fall back to an empty array when the field does not exist. Then you can simply wrap it in $size and compare the result inside $expr to fetch documents whose array has size > N.

db.collection.aggregate([
  {
    "$match": {
      $expr: {
        $gt: [
          {
            $size: {
              "$ifNull": [
                "$genres",
                []
              ]
            }
          },
          3
        ]
      }
    }
  }
])

Mongo Playground
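
Since $expr is also accepted by find(), the same idea can be sketched as a plain filter object. The helper name sizeGreaterThan below is hypothetical (it is not part of MongoDB), and the sketch assumes the field is either an array, null, or missing: $ifNull does not help if the field holds some other type, such as a string, in which case $size would still raise an error.

```javascript
// Hypothetical helper: builds a find() filter matching documents whose
// `field` has more than `n` elements, treating a missing or null field
// as an empty array via $ifNull.
function sizeGreaterThan(field, n) {
  return {
    $expr: {
      $gt: [{ $size: { $ifNull: ["$" + field, []] } }, n],
    },
  };
}

const ifNullFilter = sizeGreaterThan("genres", 3);
console.log(JSON.stringify(ifNullFilter, null, 2));
// In mongosh, assuming the collection from the question:
//   db.movies.find(ifNullFilter)
```

Swap $gt for $gte in the helper if "n or more" is what is wanted.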

answered Oct 26 '25 10:10 by ray


The query that combines $isArray and the $size check inside $and can fail because:

  1. the order of evaluation is not guaranteed to be left-to-right
  2. an earlier criterion evaluating to false does not prevent the remaining criteria from being evaluated.

This may be quite surprising; at least it was to me. I'm not aware of any other language or framework that behaves this way.

From the docs for $and, emphasis mine:

To allow the query engine to optimize queries, $and handles errors as follows:

  • If any expression supplied to $and would cause an error when evaluated alone, the $and containing the expression may cause an error but an error is not guaranteed.

  • An expression supplied after the first expression supplied to $and may cause an error even if the first expression evaluates to false.

So for an expression like:

{
  $and: [
    { $isArray: "$genres" },
    { $gt: [ { $size: "$genres" }, 3 ] }
  ]
}

The $size can still be evaluated even if $isArray is false.


Workarounds:

  1. Use $arrayElemAt with a check for not null and not undefined
    This way both things need to be true and neither check is dependent on the other.

    db.collection.find({
      $expr: {
        $and: [
          {
            // undefined when the array element does not exist
            $ne: [
              { $arrayElemAt: ["$genres", 3] },
              undefined
            ]
          },
          {
            // null when the field does not exist
            $ne: [
              { $arrayElemAt: ["$genres", 3] },
              null
            ]
          }
        ]
      }
    })
    

    Mongo Playground

    And for an aggregation pipeline, this translates 1-1 in a $match. Mongo Playground, aggregation

  2. The query with genres.3 works (as commented above).

  3. If this were an aggregation pipeline, one could check $isArray in a separate stage before checking the size.
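
A further variant in the same spirit, not taken from the answer above: instead of guarding $size with a sibling condition inside $and, make sure $size always receives an array by choosing its argument with $cond. Because $cond produces either the field itself or an empty array, the size check no longer depends on how $and orders or skips its operands. This is only a sketch, and arrayLongerThan is a hypothetical helper name.

```javascript
// Hypothetical helper (not part of MongoDB): builds a find() filter that
// matches documents whose `field` is an array with more than `n` elements.
function arrayLongerThan(field, n) {
  const ref = "$" + field;
  return {
    $expr: {
      $gt: [
        // $cond yields the field when it is an array and [] otherwise,
        // so $size is never handed a missing, null, or non-array value.
        { $size: { $cond: [{ $isArray: ref }, ref, []] } },
        n,
      ],
    },
  };
}

const condFilter = arrayLongerThan("genres", 3);
console.log(JSON.stringify(condFilter, null, 2));
// In mongosh, assuming the collection from the question:
//   db.movies.find(condFilter)
```

Unlike a plain $ifNull fallback, this form also tolerates documents where the field exists but holds a non-array value.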

answered Oct 26 '25 12:10 by aneroid


