Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to query on two arrays at the same time in MongoDB?

I have the following documents, where each document has fields values and dates, which are arrays. These arrays always have the same size in each document, meaning each value in dates corresponds to a value in values:

[
  {
    _id: "Stock1",
    values: [
      1,
      2,
      3
    ],
    dates: [
      ISODate("2000-01-01"),
      ISODate("2010-01-01"),
      ISODate("2020-01-01")
    ]
  },
  {
    _id: "Stock2",
    values: [
      4,
      5,
    ],
    dates: [
      ISODate("2000-01-01"),
      ISODate("2010-01-01")
    ]
  },
  {
    _id: "Stock3",
    values: [
      7,
      8,
      9
    ],
    dates: [
      ISODate("2000-01-01"),
      ISODate("2010-01-01"),
      ISODate("2020-01-01")
    ]
  }
]

I would like to query my documents such that I obtain values between dates "2010-01-01" and dates "2020-01-01" (included) for "Stock1" and "Stock3" only, i.e. I would like to end up with:

[
  {
    _id: "Stock1",
    values: [
      2,
      3
    ],
    dates: [
      ISODate("2010-01-01"),
      ISODate("2020-01-01")
    ]
  },
  {
    _id: "Stock3",
    values: [
      8,
      9
    ],
    dates: [
      ISODate("2010-01-01"),
      ISODate("2020-01-01")
    ]
  }
]

For the moment, I am doing the following:

db.collection.aggregate([
  {
    $match: {
      _id: {
        $in: [
          "Stock1",
          "Stock3"
        ]
      }
    }
  },
  {
    $unwind: {
      path: "$dates",
      includeArrayIndex: "date_index"
    }
  },
  {
    $match: {
      dates: {
        $gte: ISODate("2010-01-01"),
        $lte: ISODate("2020-01-01")
      }
    }
  },
  {
    $unwind: {
      path: "$values",
      includeArrayIndex: "value_index"
    }
  },
  {
    $match: {
      $expr: {
        $eq: [
          "$date_index",
          "$value_index"
        ]
      }
    }
  },
  {
    $project: {
      date_index: 0,
      value_index: 0
    }
  }
])

But I am not quite there yet. Also, the pipeline looks long and sub-optimal. Are there better ways to do this? Finally, I am first using unwind on dates, then filtering before using unwind again on values: This is to avoid a number of documents in the pipeline being too big as the array dates and values can be potentially huge.

Any help would appreciated!

like image 961
user142389358902 Avatar asked Oct 31 '25 23:10

user142389358902


1 Answers

Query

  • match the to keep only "Stock1","Stock3"
  • filter on indexes of dates (range (size "$dates")), to get only the indexes of elements with date in the range [2010-2020]
  • 2 map to get those filtered indexes from dates and values

*we could do all in like 1 reduce but it would be nested and more complicated code, also $concatArrays is slow and its not good idea to reduce array to array.The bellow is simpler and can work even for very big arrays.

PlayMongo

aggregate(
[{"$match": {"_id": { "$in": ["Stock1","Stock3"]}}}
 {"$set": 
    {"indexes": 
      {"$filter": 
        {"input": {"$range": [0, {"$size": "$dates"}]},
          "cond": 
          {"$and": 
            [{"$gte": 
                [{"$arrayElemAt": ["$dates", "$$this"]},
                  ISODate("2010-01-01T00:00:00Z")]},
              {"$lte": 
                [{"$arrayElemAt": ["$dates", "$$this"]},
                  ISODate("2020-01-01T00:00:00Z")]}]}}}}},
  {"$set": 
    {"dates": 
      {"$map": 
        {"input": "$indexes",
          "in": {"$arrayElemAt": ["$dates", "$$this"]}}}}},
  {"$set": 
    {"values": 
      {"$map": 
        {"input": "$indexes",
          "in": {"$arrayElemAt": ["$values", "$$this"]}}}}},
  {"$unset": ["indexes"]}])
like image 77
Takis Avatar answered Nov 03 '25 12:11

Takis



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!