Gremlin - Traverse to leaf nodes in tree graph

I have a tree data structure in a graph, as shown in the diagram below. Each color represents nodes with a different label, related like employee -> app -> project -> pv -> scan.

Question #1:

I want to find all leaf nodes (ones in green) of top node 0.

I tried the code below with a loop, but it returns all nodes with the label employee, not just the leaf nodes.

g.V().has('person', 'id', '0').repeat(__.in('reportsTo')).emit().values('id')

A sample graph can be found in GremlinBin.

How do I find all green leaf nodes?

Update #1:

As mentioned in the comments, I tried the tree pattern, but it doesn't let me call getLeafObjects() on the tree. I'm not sure what's missing. Also, I am again only able to create a tree of employee nodes. How do I traverse to the scan nodes?

> tree = g.V().has('person', 'id', '0').repeat(__.in('reportsTo')).emit().tree()
>  tree.getLeafObjects()
No signature of method: org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal.getLeafObjects() is applicable for argument types: () values: []

Question #2:

How do I retrieve one child vertex from among the children under each parent based on max(id)? In my sample graph, each black vertex can have one or more green child vertices. I want to find the green vertex with max(property) under each black vertex.


indusBull asked Oct 20 '25 14:10


1 Answer

I think you just need to modify your emit(). Without an argument, emit() emits everything that passes through the repeat(). If you only want leaf vertices, give it a predicate such as not(outE()), which emits a vertex only if it has no outgoing edges, meaning it is a leaf vertex. You might need to make your specific emit() predicate a bit smarter, as it looks like your schema is such that different types of vertices have different rules for what makes them a leaf.

Given the sample graph you had in GremlinBin, I did this to get all the green vertices at the bottom of your picture above:

g.V().has('employee','id',1).
  repeat(__.in('reportsTo')).emit().
  repeat(out('has')).emit(__.not(outE('has')))
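
If it helps to see what the two repeat() steps are doing, here is a rough plain-Python model of the traversal above. It is only a sketch, not Gremlin or any TinkerPop API: the toy graph, the vertex names (e1, app1, scan1, ...) and the helper functions subordinates() and leaves_below() are all made up for illustration.

```python
# Toy graph; vertex names and edges are hypothetical, not the GremlinBin data.
REPORTS_TO = {          # employee -> manager (edge: employee -reportsTo-> manager)
    "e2": "e1",
    "e3": "e2",
}
HAS = {                 # parent -> children (edge: parent -has-> child)
    "e2": ["app1"],
    "e3": ["app2"],
    "app1": ["scan1", "scan2"],
    "app2": ["scan3"],
}


def subordinates(root):
    """Model of repeat(__.in('reportsTo')).emit(): everyone below root, root excluded."""
    found, frontier = [], [root]
    while frontier:
        # One repeat() iteration: step to the employees who report to the frontier.
        frontier = [emp for emp, boss in REPORTS_TO.items() if boss in frontier]
        found.extend(frontier)  # emit() with no argument keeps every vertex reached
    return found


def leaves_below(start):
    """Model of repeat(out('has')).emit(__.not(outE('has'))): leaves only."""
    leaves, frontier = [], [start]
    while frontier:
        # One repeat() iteration: follow outgoing 'has' edges.
        frontier = [child for v in frontier for child in HAS.get(v, [])]
        # The emit() predicate: keep a vertex only if it has no outgoing 'has' edge.
        leaves.extend(v for v in frontier if not HAS.get(v))
    return leaves


all_leaves = [leaf for emp in subordinates("e1") for leaf in leaves_below(emp)]
print(all_leaves)  # ['scan1', 'scan2', 'scan3']
```

The intermediate app vertices are walked through but never emitted, because they still have outgoing 'has' edges. Only the vertices at the bottom of each branch come out.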

In answer to your second question you could extend the above to:

g.V().has('employee','id',1).
  repeat(__.in('reportsTo')).emit().
  repeat(out('has')).emit(__.not(outE('has'))).
  group().
    by(__.in('has')).
  select(values).
  unfold().
  order(local).
    by('id',decr).
  local(unfold().limit(1))

Basically, group the leaf vertices by their parent vertex, then select the values, which are the lists of leaves per parent. Flatten those with unfold(), order each list in descending order by the property you care about (in this case "id"), and then take the first item of each ordered list.
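
The same group, order, and pick-first idea can be sketched in plain Python. Again, this is just an illustration under assumptions: the (parent, id) pairs and the max_leaf_per_parent() helper are hypothetical stand-ins for the leaf vertices and their "id" property, not anything from the sample graph.

```python
from collections import defaultdict

# Hypothetical leaves as (parent vertex, leaf id) pairs.
LEAVES = [("pv1", 11), ("pv1", 14), ("pv2", 7)]


def max_leaf_per_parent(leaves):
    """Return the largest leaf id under each parent, in first-seen parent order."""
    groups = defaultdict(list)
    for parent, leaf_id in leaves:
        # Like group().by(__.in('has')): bucket each leaf under its parent.
        groups[parent].append(leaf_id)
    # Like select(values).unfold(): work on one list of leaves per parent.
    # Like order(local).by('id', decr) plus limit(1): sort descending, keep the first.
    return [sorted(ids, reverse=True)[0] for ids in groups.values()]


print(max_leaf_per_parent(LEAVES))  # [14, 7]
```

Sorting and taking the first element mirrors the Gremlin steps one for one. In ordinary Python you would more likely just call max() on each list.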

stephen mallette answered Oct 23 '25 09:10


