I need to count how many times each array value occurs. I want to write the results to a collection, but when I add the $out stage, it fails with "can't use an array for _id".
Is there any way to project the value of the _id field from the group stage into a new key and create a new _id?
db.djnNews_filtered.aggregate([
{$unwind:"$processed_text.headline_trigrams"},
{$group:{_id:"$processed_text.headline_trigrams","num":{$sum:1}}},
{$sort:{"num":-1}}
])
{ "_id" : [ "Reports", "First", "Quarter" ], "num" : 279 }
{ "_id" : [ "ST", "upside", "prevails" ], "num" : 167 }
{ "_id" : [ "First", "Quarter", "Results" ], "num" : 160 }
{ "_id" : [ "Announces", "First", "Quarter" ], "num" : 155 }
db.djnNews_filtered.aggregate([
{$unwind:"$processed_text.headline_trigrams"},
{$group:{_id:"$processed_text.headline_trigrams","num":{$sum:1}}},
{$sort:{"num":-1}},
{$out:"new_collection"}
])
assert: command failed: {
"errmsg" : "exception: insert for $out failed: { connectionId: 3, err: \"can't use an array for _id\", code: 2, n: 0, ok: 1.0 }",
"code" : 16996,
"ok" : 0
} : aggregate failed
In MongoDB, you can't have a document with an _id that is an array.
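You can $project the array into a different field. Exclude the original _id in the same stage (_id: 0); otherwise $project keeps it by default and $out fails the same way. With _id removed, MongoDB generates a new ObjectId _id for each document it inserts.
db.djnNews_filtered.aggregate([
{$unwind:"$processed_text.headline_trigrams"},
{$group:{_id:"$processed_text.headline_trigrams","num":{$sum:1}}},
{$sort:{"num":-1}},
{$project: {_id: 0, trigram: "$_id", count: "$num"}},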
{$out:"new_collection"}
])
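To see why the array has to leave _id, here is a rough plain-JavaScript sketch that mimics the $unwind, $group and $project stages on in-memory data. It does not talk to MongoDB; the sample documents and the hasArrayId helper are made up for illustration, and it assumes the $project stage sets _id: 0.

```javascript
// Plain JavaScript stand-in for the pipeline; no MongoDB connection needed.
// The sample documents are hypothetical, shaped like the ones in the question.
const docs = [
  { processed_text: { headline_trigrams: [
    ["Reports", "First", "Quarter"],
    ["First", "Quarter", "Results"],
  ] } },
  { processed_text: { headline_trigrams: [
    ["Reports", "First", "Quarter"],
  ] } },
];

// Like $unwind: one entry per trigram.
const trigrams = docs.flatMap((d) => d.processed_text.headline_trigrams);

// Like $group with {$sum: 1}: count occurrences, keyed on the trigram array.
const counts = new Map();
for (const t of trigrams) {
  const key = JSON.stringify(t);
  counts.set(key, (counts.get(key) || 0) + 1);
}
const grouped = [...counts].map(([key, num]) => ({ _id: JSON.parse(key), num }));

// Like $project with _id: 0: move the array to `trigram` and drop `_id`.
const projected = grouped.map(({ _id, num }) => ({ trigram: _id, count: num }));

// Hypothetical helper: true when a document carries an array as its _id,
// which is the shape the insert behind $out rejects.
const hasArrayId = (doc) => Array.isArray(doc._id);

console.log(grouped.some(hasArrayId));   // group output still has array _id values
console.log(projected.some(hasArrayId)); // projected output no longer does
console.log(projected);
```

The projected documents have no _id at all, which is the shape that lets the server assign a fresh one on insert.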
Also, I'm not sure what you intend by sorting before writing the documents to a collection. If the sort was only for inspecting the data before you decided to save it, consider removing that step.