ArangoDB aggregation counts of objects in array

I'm trying to generate facets (aggregation counts) for the following documents in a graph (based on collections rather than a named graph):

{
  "relation": "decreases",
  "edge_type": "primary",
  "subject_lbl": "act(p(HGNC:AKT1), ma(DEFAULT:kin))",
  "object_lbl": "act(p(HGNC:CDKN1B), ma(DEFAULT:act))",
  "annotations": [
    {
      "type": "Disease",
      "label": "cancer",
      "id": "cancer"
    },
    {
      "type": "Anatomy",
      "label": "liver",
      "id": "liver"
    }
  ]
}

The following works great to get facets (aggregation counts) for the edge_type:

FOR doc in edges
COLLECT 
    edge_type = doc.edge_type WITH COUNT INTO edge_type_cnt
RETURN {edge_type, edge_type_cnt}

I tried the following to get counts for the annotations[*].type value:

FOR doc in edges
COLLECT 
    edge_type = doc.edge_type WITH COUNT INTO edge_type_cnt,
    annotations = doc.annotations[*].type WITH COUNT INTO anno_cnt
RETURN {edge_type, edge_type_cnt, annotations, anno_cnt}

This results in an error. Any ideas what I'm doing wrong? Thanks!

William asked May 31 '26 17:05


1 Answer

This thread pointed me in the right direction: https://groups.google.com/forum/#!topic/arangodb/vNFNVrYo9Yo (linked from the question "ArangoDB Faceted Search Performance").

FOR doc in edges
    FOR anno in doc.annotations
    COLLECT anno_type = anno.type WITH COUNT INTO anno_cnt
RETURN {anno_type, anno_cnt}

Results in:

anno_type      anno_cnt
Anatomy        4275
Cell           2183
CellLine       2093
CellStructure  2081
Disease        2126
Organism       2075
TextLocation   2121

Looping over the edges and then the annotations array is the key that I was missing.
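
For anyone who finds the nested FOR hard to picture, here is a minimal Python sketch of the same logic. It is only an illustration, not how ArangoDB executes the query: the sample `edges` list and the `facet_counts` helper are hypothetical, made up to mirror the document shape from the question. The outer loop walks the edges, the inner loop walks each annotations array, and every array element contributes one count to its type.

```python
from collections import Counter

# Hypothetical sample data shaped like the edge document in the question.
edges = [
    {"edge_type": "primary", "annotations": [{"type": "Disease"}, {"type": "Anatomy"}]},
    {"edge_type": "primary", "annotations": [{"type": "Anatomy"}]},
    {"edge_type": "computed", "annotations": []},
]

def facet_counts(docs):
    """Count annotation types across all docs, one count per array element."""
    counts = Counter()
    for doc in docs:                             # FOR doc IN edges
        for anno in doc.get("annotations", []):  # FOR anno IN doc.annotations
            counts[anno["type"]] += 1            # COLLECT anno_type ... WITH COUNT INTO
    return dict(sorted(counts.items()))          # sorted by type name for readability

anno_facets = facet_counts(edges)
print(anno_facets)
```

Note that an edge with an empty annotations array contributes nothing to the counts, and an edge with several annotations is counted once per annotation, so the totals can exceed the number of edges.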

William answered Jun 03 '26 06:06