I'm new to ArangoDB, having trouble optimizing my queries and am hoping for some help.
The query I've provided below is a real example that I'm struggling with, 758.078 ms on my dev database but on staging, with a much larger dataset, it takes 531.511 s.
I'll also provide the size of each of the edge tables I'm traversing in dev and staging. Any help is really appreciated.
for doc in document
filter repo._key == "my-key"
for v, e, p in 3 any doc edge1, edge2, edge3
options {uniqueVertices: 'global', bfs: true}
filter DATE_ISO8601(p.vertices[2].date) > DATE_ISO8601("2017-09-04T00:00:01Z")
and DATE_ISO8601(p.vertices[2].date) < DATE_ISO8601("2017-09-15T23:59:59Z")
limit 1
return {
commit: p.vertices[2].hash,
date: p.vertices[2].date,
message: p.vertices[2].message,
author: p.vertices[1].email,
loc: p.vertices[3].stats.additions
}
DEV
STAGING
The query is potentially slow because the filter condition
filter
DATE_ISO8601(p.vertices[2].date) > DATE_ISO8601("2017-09-04T00:00:01Z"
and
DATE_ISO8601(p.vertices[2].date) < DATE_ISO8601("2017-09-15T23:59:59Z")
is not applied during the traversal but only afterwards.
Potentially this is due to the function calls (to DATE_ISO8601) in the filter conditions. If your date values are stored as numbers, can you check whether the following filter condition speeds up the query:
filter
p.vertices[2].date > DATE_TIMESTAMP("2017-09-04T00:00:01Z"
and
p.vertices[2].date < DATE_TIMESTAMP("2017-09-15T23:59:59Z")
That modified filter condition should allow pulling the filter condition inside the traversal, so it is executed earlier.
You can verify the query execution plans using db._explain(<query string goes here>); in the ArangoShell or from the web interface's AQL editor.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With