Ive used MySQL since I was a young boy and now I have to switch to MongoDB for various reasons.
I write a log which saves every php error in a MongoDB collection. Reading out the errors is not the problem, using a simple find() is pretty easy and it's great that you can use php arrays to get to the data.
Now I want to have some statistics about the errors. My Collection looks like so:
{
"_id": ObjectId("51af10ca0e5e723c0a000000"),
"Errors": {
"2048": {
"0": {
"Message": "Declaration of ADODB_mysqli::MetaIndexes() should be compatible with ADOConnection::MetaIndexes($table, $primary = false, $owner = false)",
"File": "File.php",
"Line": NumberInt(29),
"Time": NumberInt(1370427591)
}
},
"2": {
"0": {
"Message": "Error",
"File": "File.php",
"Line": NumberInt(29),
"Time": NumberInt(1370427591)
},
"1": {
"Message": "Error",
"File": "File.php",
"Line": NumberInt(29),
"Time": NumberInt(1370427591)
}
},
"8": {
"0": {
"Message": "Undefined index: PluginLastAdded",
"File": "File.php",
"Line": NumberInt(36),
"Time": NumberInt(1370427594)
},
"1": {
"Message": "Undefined index: PluginLastAdded",
"File": "File.php",
"Line": NumberInt(36),
"Time": NumberInt(1370427594)
}
}
}
}
Now I want to know how often each error in this entry occours. It would be great to have a seperated list into 2048, 2, 8 and then the count of each error.
Is that possible without much php code but using MongoDB's aggregation?
Any help would be great, MongoDB is 180° different than MySQL in my opinion, and the switch is pretty hard.
To build upon what Sammaye mentioned above, the following schema with real arrays would be more suitable:
{
"_id": ObjectId("51af10ca0e5e723c0a000000"),
"errors": [
{
"code": 2048,
"message": "Declaration of ADODB_mysqli::MetaIndexes() should be compatible with ADOConnection::MetaIndexes($table, $primary = false, $owner = false)",
"file": "File.php",
"line": NumberInt(29),
"time": NumberInt(1370427591)
},
{
"code": 2,
"message": "Error",
"file": "File.php",
"line": NumberInt(29),
"time": NumberInt(1370427591)
},
{
"code": 2,
"message": "Error",
"file": "File.php",
"line": NumberInt(29),
"time": NumberInt(1370427591)
},
{
"code": 8,
"message": "Undefined index: PluginLastAdded",
"file": "File.php",
"line": NumberInt(36),
"time": NumberInt(1370427594)
},
{
"code": 8,
"message": "Undefined index: PluginLastAdded",
"file": "File.php",
"line": NumberInt(36),
"time": NumberInt(1370427594)
}
]
}
The array structure also makes indexing and querying much more straightforward. Indexes are capable indexing array values, and MongoDB also makes it easy to query on arrays. For instance, you have the flexibility to query for particular errors (perhaps a combination of code and file) with $elemMatch. Additionally, since errors is a real array, you can use various update operators, such as $push or $pull.
One thing to consider is that nested objects restrict the way you index and write queries. In your previous example, the only way to query for the first error message is by Errors.2048.0.Message, but using the schema above would allow for a query on errors.message.
Arrays also makes the Aggregation Framework a viable option for you, particularly since it will allow you to iterate through arrays with its $unwind operator and then $group on values within array elements. In addition to the MongoDB documentation on the aggregation framework, you may find this presentation helpful, goes through the different operators visually.
To your earlier question on obtaining a count of the number of errors per code, the following aggregation framework pipeline would calculate that across all documents in the entire collection:
db.foo.aggregate([
{ $unwind: "$errors" },
{ $group: {
_id: "$errors.code",
num: { $sum: 1 }
}}
]);
Lastly, I would suggest storing the time field as a BSON date (MongoDate in PHP) instead of an integer. That opens up the option of using date operators in the aggregation framework.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With