I'm trying to use MongoDB's (v3.2.11) aggregation framework to process some log documents that look like the following:
{
    "_id" : ObjectId("58b753c6d4421f00216de942"),
    "session_id" : "7CB8725A-3994-45B8-9CA2-92FC19406288",
    "event_type" : "connect_begin",
    "timestamp" : "1488409541.674997",
    "user_id" : "f6830aac-60be-44df-9fa7-7aa530d637ce",
    "u_at" : ISODate("2017-03-01T23:05:42.077Z"),
    "c_at" : ISODate("2017-03-01T23:05:42.077Z")
}
My collection contains pairs of the above logs that share a session_id, one log for the begin event and one for the end event. The end goal is to calculate the length of these sessions via the difference in the timestamps.
So far I have been able to write an aggregation pipeline that groups the logs by $session_id and provides an array of the two $events associated with the session. My idea was that next I would $project the begin and end timestamps onto the final result, using a $cond to check the event_type of each $event in the array, which will tell me if it is a begin or end event. I have pasted what I have so far below:
db.time_spent_logs.aggregate([
    { $group: {
        _id: '$session_id',
        events: {
            $push: {
                event_type: '$event_type',
                timestamp: '$timestamp'
            }
        }
    }},
    { $project: {
        start: {
            $cond: {
                if: { $or: [ { $strcasecmp: [ "$events[0].event_type", "trending_begin" ]}, { $strcasecmp: [ "$events[0].event_type", "connect_begin" ]}] },
                then: '$events[0].timestamp',
                else: '$events[1].timestamp'
            }
        },
        end: {
            $cond: {
                if: { $or: [ { $strcasecmp: [ "$events[0].event_type", "trending_end" ]}, { $strcasecmp: [ "$events[0].event_type", "connect_end" ]}] },
                then: '$events[0].timestamp',
                else: '$events[1].timestamp'
            }
        }
    }}
])
This produces a list of the following:
{ "_id" : "4EC4B831-D3C7-49C6-9EC8-301981639ED7" }
I think my problem is in the if of my $cond, where I am comparing the value of the event_type field for each $event with a string to see if it is one of our two begin or end event types. I believe I have something wrong somewhere in this if / $or / $strcasecmp combination...
I have also attempted using $literal to compare the event_type, with no success.
Any help would be very much appreciated!
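On MongoDB 3.2 and above you can use $filter instead of applying the conditions manually. The $in expression used in the condition requires 3.4 or later (on 3.2, an $or of $eq comparisons does the same job). With $in it looks like this: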
{
    $project: {
        start: {
            // Filter the events, keep only 'begin' events
            $filter: {
                input: '$events',
                as: 'event',
                cond: {$in: ['$$event.event_type', ['trending_begin', 'connect_begin']]}
            }
        },
        end: {
            // Same with 'end' events
            $filter: {
                input: '$events',
                as: 'event',
                cond: {$in: ['$$event.event_type', ['trending_end', 'connect_end']]}
            }
        }
    }
}
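Since the question mentions v3.2.11, where $in is not available as an aggregation expression, here is a minimal sketch of the same $project stage using only $filter, $or and $eq. The helper typeIsOneOf and the variable projectStage are hypothetical names introduced for illustration; they are not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): builds an $or of $eq checks
// against the current $filter element, which behaves like $in but only
// uses operators that exist in MongoDB 3.2.
function typeIsOneOf(types) {
    return {
        $or: types.map(function (t) {
            return { $eq: ['$$event.event_type', t] };
        })
    };
}

// Same shape as the $project stage above, with $in swapped out.
var projectStage = {
    $project: {
        start: {
            $filter: {
                input: '$events',
                as: 'event',
                cond: typeIsOneOf(['trending_begin', 'connect_begin'])
            }
        },
        end: {
            $filter: {
                input: '$events',
                as: 'event',
                cond: typeIsOneOf(['trending_end', 'connect_end'])
            }
        }
    }
};
```

The stage can be placed after the $group stage from the question in place of the $in version.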
The resulting 'start' and 'end' properties will be arrays of begin and end events respectively. If you're sure the data is consistent and each session has exactly two event records (one begin, one end), you can safely use $arrayElemAt to take the first element of each array:
{
    $project: {
        start: {
            // Take the first of the filtered events
            $arrayElemAt: [{
                $filter: {
                    input: '$events',
                    as: 'event',
                    cond: {$in: ['$$event.event_type', ['trending_begin', 'connect_begin']]}
                }
            }, 0]
        },
        end: {
            // Take the first of the filtered events
            $arrayElemAt: [{
                $filter: {
                    input: '$events',
                    as: 'event',
                    cond: {$in: ['$$event.event_type', ['trending_end', 'connect_end']]}
                }
            }, 0]
        }
    }
}
With this, 'start' and 'end' are plain objects rather than arrays.
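Putting the pieces together, the whole query could look roughly like the sketch below. It combines the $group stage from the question with the $arrayElemAt / $filter projection, and it keeps $in, so it assumes MongoDB 3.4 or later. The names firstEventOfType, pipeline and sessionLength are hypothetical helpers introduced here. Because the timestamps are stored as strings, the subtraction is done on the client, assuming they are epoch seconds.

```javascript
// Hypothetical helper: expression for "first event whose type is in `types`".
// Uses $in, which needs MongoDB 3.4+.
function firstEventOfType(types) {
    return {
        $arrayElemAt: [{
            $filter: {
                input: '$events',
                as: 'event',
                cond: { $in: ['$$event.event_type', types] }
            }
        }, 0]
    };
}

var pipeline = [
    // Collect the events of each session into one array
    { $group: {
        _id: '$session_id',
        events: { $push: { event_type: '$event_type', timestamp: '$timestamp' } }
    } },
    // Pick the begin and end event out of that array
    { $project: {
        start: firstEventOfType(['trending_begin', 'connect_begin']),
        end: firstEventOfType(['trending_end', 'connect_end'])
    } }
];

// Hypothetical client-side helper: parses the string timestamps and
// returns their difference (assumed to be seconds).
function sessionLength(doc) {
    return parseFloat(doc.end.timestamp) - parseFloat(doc.start.timestamp);
}

// In the mongo shell:
// db.time_spent_logs.aggregate(pipeline).forEach(function (doc) {
//     print(doc._id + ': ' + sessionLength(doc));
// });
```

If a session is missing its begin or end event, the corresponding field will be absent from the result, so sessionLength should only be called on complete pairs.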