Is it possible to write union queries in MongoDB across 2 or more collections, similar to SQL UNION queries?
I'm using Spring's MongoTemplate, and in my use case I need to fetch data from 3-4 collections based on some conditions. Can this be achieved in a single operation?
For example, I have a field named "circuitId" that is present in all 4 collections, and I need to fetch all records from all 4 collections for which that field matches a given value.
Doing unions in MongoDB in a 'SQL UNION' fashion is possible using aggregations along with lookups, in a single query.
Something like this:
db.getCollection("AnyCollectionThatContainsAtLeastOneDocument").aggregate(
  [
    { $limit: 1 }, // Reduce the result set to a single document.
    { $project: { _id: 1 } }, // Strip all fields except the Id.
    { $project: { _id: 0 } }, // Strip the id. The document is now empty.
    // Lookup all collections to union together.
    { $lookup: { from: 'collectionToUnion1', pipeline: [...], as: 'Collection1' } },
    { $lookup: { from: 'collectionToUnion2', pipeline: [...], as: 'Collection2' } },
    { $lookup: { from: 'collectionToUnion3', pipeline: [...], as: 'Collection3' } },
    // Merge the collections together.
    {
      $project:
      {
        Union: { $concatArrays: ["$Collection1", "$Collection2", "$Collection3"] }
      }
    },
    { $unwind: "$Union" }, // Unwind the union collection into a result set.
    { $replaceRoot: { newRoot: "$Union" } } // Replace the root to cleanup the resulting documents.
  ]);
Here is the explanation of how it works:
Instantiate an aggregate out of any collection of your database that has at least one document in it. If you can't guarantee that any collection of your database will be non-empty, you can work around this by creating a 'dummy' collection containing a single empty document, kept there specifically for doing union queries.
Make the first stage of your pipeline { $limit: 1 }. This discards all the documents of the collection except the first one.
Strip all the fields of the remaining document by using $project stages:
{ $project: { _id: 1 } },
{ $project: { _id: 0 } }
Your aggregate now contains a single, empty document. It's time to add lookups for each collection you want to union together. You may use the pipeline field to do some specific filtering, or leave localField and foreignField as null to match the whole collection.
{ $lookup: { from: 'collectionToUnion1', pipeline: [...], as: 'Collection1' } },
{ $lookup: { from: 'collectionToUnion2', pipeline: [...], as: 'Collection2' } },
{ $lookup: { from: 'collectionToUnion3', pipeline: [...], as: 'Collection3' } }
You now have an aggregate containing a single document that contains 3 arrays like this:
{
Collection1: [...],
Collection2: [...],
Collection3: [...]
}
You can then merge them together into a single array using a $project stage along with the $concatArrays aggregation operator:
{
  $project:
  {
    Union: { $concatArrays: ["$Collection1", "$Collection2", "$Collection3"] }
  }
}
You now have an aggregate containing a single document, which holds an array containing your union of collections. What remains to be done is to add an $unwind and a $replaceRoot stage to split that array into separate documents:
{ $unwind: "$Union" },
{ $replaceRoot: { newRoot: "$Union" } }
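To make the last three stages concrete, here is a rough in-memory analogy in plain JavaScript. It does not talk to MongoDB at all: the sample arrays are made-up stand-ins for the $lookup results, and the array operations only mimic what $concatArrays, $unwind and $replaceRoot do to the single document.

```javascript
// In-memory analogy only: plain arrays stand in for the three $lookup results.
// The sample values are illustrative, not taken from a real database.
const singleDocument = {
  Collection1: [{ circuitId: 12, a: "1" }, { circuitId: 12, a: "5" }],
  Collection2: [{ circuitId: 12, b: "x" }],
  Collection3: [{ circuitId: 12, c: "i" }],
};

// $project + $concatArrays: one document holding one merged array.
const projected = {
  Union: [
    ...singleDocument.Collection1,
    ...singleDocument.Collection2,
    ...singleDocument.Collection3,
  ],
};

// $unwind: one output document per array element, each still wrapped in "Union".
const unwound = projected.Union.map((element) => ({ Union: element }));

// $replaceRoot: promote the wrapped element so it becomes the document itself.
const unionResult = unwound.map((doc) => doc.Union);

console.log(unionResult);
```

Note that nothing here removes duplicates: every element of every array ends up in the result.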
Voilà. You now have a result set containing the collections you wanted to union together. You can then add more stages to filter it further, sort it, or apply $skip and $limit. Pretty much anything you want.
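For the use case in the question (the same "circuitId" filter on every collection), the stages of this lookup-based approach could be assembled by a small helper. This is only a sketch: buildLookupUnionPipeline, its parameters and the collection names are hypothetical, and the helper just builds a plain array of stage objects that you would then pass to aggregate().

```javascript
// Hypothetical helper (not part of any MongoDB API): builds the lookup-based
// union pipeline for an arbitrary list of collections and one shared filter.
function buildLookupUnionPipeline(collectionNames, filter) {
  // Array field names Collection1, Collection2, ... as in the answer above.
  const arrayFields = collectionNames.map((_, index) => "Collection" + (index + 1));

  // One $lookup per collection; its sub-pipeline applies the filter.
  const lookups = collectionNames.map((name, index) => ({
    $lookup: { from: name, pipeline: [{ $match: filter }], as: arrayFields[index] },
  }));

  return [
    { $limit: 1 },            // Keep a single document.
    { $project: { _id: 1 } }, // Strip all fields except the id.
    { $project: { _id: 0 } }, // Strip the id as well.
    ...lookups,
    { $project: { Union: { $concatArrays: arrayFields.map((f) => "$" + f) } } },
    { $unwind: "$Union" },
    { $replaceRoot: { newRoot: "$Union" } },
  ];
}

// Assumed collection names, for illustration only.
const lookupPipeline = buildLookupUnionPipeline(
  ["collection1", "collection2", "collection3", "collection4"],
  { circuitId: 12 }
);
console.log(JSON.stringify(lookupPipeline, null, 2));

// In the mongo shell, assuming "collection1" contains at least one document:
// db.getCollection("collection1").aggregate(lookupPipeline)
```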
Starting in MongoDB 4.4, the aggregation framework provides a new $unionWith stage, which performs the union of two collections (it combines the pipeline results from two collections into a single result set, keeping duplicates).
Thus, in order to combine documents from 3 collections:
// > db.collection1.find()
// { "circuitId" : 12, "a" : "1" }
// { "circuitId" : 17, "a" : "2" }
// { "circuitId" : 12, "a" : "5" }
// > db.collection2.find()
// { "circuitId" : 12, "b" : "x" }
// { "circuitId" : 12, "b" : "y" }
// > db.collection3.find()
// { "circuitId" : 12, "c" : "i" }
// { "circuitId" : 32, "c" : "j" }
db.collection1.aggregate([
{ $match: { circuitId: 12 } },
{ $unionWith: { coll: "collection2", pipeline: [{ $match: { circuitId: 12 } }] } },
{ $unionWith: { coll: "collection3", pipeline: [{ $match: { circuitId: 12 } }] } }
])
// { "circuitId" : 12, "a" : "1" }
// { "circuitId" : 12, "a" : "5" }
// { "circuitId" : 12, "b" : "x" }
// { "circuitId" : 12, "b" : "y" }
// { "circuitId" : 12, "c" : "i" }
This first filters the documents from collection1, then includes documents from collection2 into the pipeline with the new $unionWith stage, and finally includes documents from collection3 into the pipeline with the same $unionWith stage. The pipeline parameter is an optional aggregation pipeline applied on documents from the collection being merged before the merge happens.
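With 3-4 collections and the same filter on each, the $unionWith stages are repetitive enough that they can be generated. The following is a minimal sketch under that assumption: buildUnionWithPipeline and its parameters are hypothetical names, and the helper only builds the stage array (it requires MongoDB 4.4 or later once you actually run it).

```javascript
// Hypothetical helper (not part of any MongoDB API): builds a $match on the
// starting collection followed by one $unionWith per additional collection.
function buildUnionWithPipeline(otherCollections, filter) {
  return [
    // Filter the collection that aggregate() is called on.
    { $match: filter },
    // Append the filtered documents of every other collection.
    ...otherCollections.map((coll) => ({
      $unionWith: { coll: coll, pipeline: [{ $match: filter }] },
    })),
  ];
}

// Assumed collection names, for illustration only.
const unionWithPipeline = buildUnionWithPipeline(
  ["collection2", "collection3", "collection4"],
  { circuitId: 12 }
);
console.log(JSON.stringify(unionWithPipeline, null, 2));

// In the mongo shell:
// db.collection1.aggregate(unionWithPipeline)
```

Unlike the lookup-based approach, this does not need a non-empty starting collection, since each $unionWith contributes its documents directly to the result set.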