I am trying to deal with time series using MongoDB. The common solution adopted by community is to use subdocuments to store information at different level of granularity (see Schema Design for Time Series Data in MongoDB).
For example, take a look at the following document:
{
timestamp_minute: ISODate("2013-10-10T23:06:00.000Z"),
type: “memory_used”,
values: [
999999, // 1 second
…
1000000, // nth second
1500000, // n+1th second
…
2000000 // 60th
]
}
The document is indexed by minute information and contains a subdocument which store more detailed information for each second.
So far so good. This kind of approach requires an optimization to work properly:
Another optimization [..] is preallocating all documents for the upcoming time period; This never causes an existing document to grow or be moved on disk.
To implement the above optimization one could use the $setOnInsert
property on the update
method.
db.getCollection('aCollection').update(
{
timestamp_minute: ISODate("2013-10-10T23:06:00.000Z"),
type: “memory_used”
},
{
$setOnInsert: { values: {'0': 0, '1': 0, '2': 0}},
$inc: {"values.30": 1}
},
{ upsert: true }
)
The problem is that it is not possible to use the same field in the same update in two different operation. The above update istruction generates the following error:
Cannot update 'values' and 'values.30' at the same time
This problem is tracked on this issue.
My question is: is there any workaround? I prefix that I can't use any batch that preallocates empty documents, because I can't know the value of the indexed fields a priori (in the above example, the values of the field type
.
Thanks in advance.
Me and my colleagues found a workaround. We can call it three step initialization.
Remember that MongoDB guarantees the atomicity of operations on a single document. With this fact in mind we can operate in the following way:
The code should looks like something similar to the following code snippet.
// Firt of all, try the update
var result = db.test.update(
{timestamp_minute: ISODate("2013-10-10T23:06:00.000Z"), type: “memory_used”},
{$inc: {"values.39": 1}},
{upsert: false}
);
// If the update do not succeed, then try to insert the document
if (result.nModified === 0) {
try {
db.test.insert(/* Put here the whole document */);
} catch (err) {
console.log(err);
}
// Here we are sure that the document exists.
// Retry to execute the update statement
db.test.update(/* Same update as above */);
}
The above procedure works if a precondition holds: _id
value should be derived from other fields in the document. In our example, _id
value would be '2013-10-10T23:06:00.000Z-memory_used
. Only using this technique, the insert at point 2. will properly fail.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With