
GremlinPipeline Java API chain traversal in Titan graph use cases

I have a use case in which I have to traverse a chain of vertices starting from a particular vertex. It's a linear chain (like a train), with each vertex connected only to the previous one. While traversing, I have to emit certain vertices based on some criteria until I reach the end of the chain.

The second use case is an extension of the first, but instead of a single chain starting from a single vertex, there are multiple such chains, again all starting from a single vertex. I have to traverse each chain and check for a particular property value in the vertices. When a match on that property is found, I have to emit that vertex and then start on the second chain, and so on.

I have to achieve this using the Gremlin Java API. It appears to be a simple task, but I am new to Gremlin and there is not much help on the internet for the Gremlin Java API either.

user3244615 asked Dec 19 '22 05:12



1 Answer

Converting Gremlin Groovy to Gremlin Java shouldn't be very difficult. I would always argue against doing it as you will:

  1. Greatly increase the size of your code
  2. Make your code less readable
  3. Make your code harder to maintain

If you work in a "Java shop" that won't hear of an outside programming language, I think that it's not too hard to sell folks on those points with just a few examples of the differences between Gremlin in Groovy and in Java (easy-to-read one-liners vs. what could be hundreds of lines of code). Furthermore, Groovy can fit into a standard Maven project either alongside Java in the same module or in a separate standalone module that other projects depend on. In most cases, I prefer the latter, as you isolate your Groovy in a single package that becomes reusable as a DSL across multiple use cases (e.g. an application, an add-on lib in the Gremlin Console, etc.).

That said, if you still must use Java, I would still start by writing Groovy. Use the Gremlin Console and get your traversal algorithm right. It sounds as though both of your use cases involve looping, so we'll just say that your traversal looks something like:

g.v(1).out.loop(1){true}{it.object.someProperty=="emitIfThis"}

So that would traverse the chain from vertex "1" until I exhaust the chain, signified by "true" in the first closure, and then emit any vertex that matches my criteria in the second closure. Once you have that much of your Gremlin defined and tested, it's time to convert to Java.
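To make the roles of the two closures concrete, here is a minimal plain-Java sketch of what that loop does on a linear chain. It deliberately does not use TinkerPop: `Node`, `next`, and `emitMatching` are hypothetical stand-ins for a vertex, its single outgoing edge, and the loop step, so treat it as an illustration of the semantics under those assumptions rather than the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a vertex in a linear chain; not a TinkerPop class.
final class Node {
    final String someProperty;
    Node next; // the single outgoing neighbour, or null at the end of the chain

    Node(String someProperty) {
        this.someProperty = someProperty;
    }
}

final class ChainLoopSketch {
    // Walks the chain after `start` until it is exhausted (the "while" closure
    // is always true) and collects every node accepted by `emit`.
    static List<Node> emitMatching(Node start, Predicate<Node> emit) {
        List<Node> emitted = new ArrayList<>();
        for (Node current = start.next; current != null; current = current.next) {
            if (emit.test(current)) {
                emitted.add(current);
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        Node a = new Node("start");
        Node b = new Node("emitIfThis");
        Node c = new Node("other");
        Node d = new Node("emitIfThis");
        a.next = b;
        b.next = c;
        c.next = d;

        List<Node> hits = emitMatching(a, n -> "emitIfThis".equals(n.someProperty));
        System.out.println(hits.size());
    }
}
```

As in the Groovy traversal, the start vertex itself is not a candidate for emission, because the walk begins with the step out of it.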

As you know that starts with a GremlinPipeline and the first part is pretty easy for conversion purposes:

new GremlinPipeline(g.getVertex(1)).out()

As you can see, the Groovy approach maps to Java fairly cleanly until you get to a point where you need a closure, and loop is one of those steps that requires one. To work with Gremlin Java you will probably find it useful to look at the javadoc for GremlinPipeline.

I used the three argument version of loop - the one marked "deprecated" (but that's ok for our purposes) - you can see it here. The first argument is simple - an integer so the first part of the translation is:

new GremlinPipeline(g.getVertex(1)).out().loop(1, closure, closure)

I've left placeholders for the two closures that we have. If you look at it this way, it's really not that different from our Groovy version - just ever so slightly different syntax.

Prior to Java 8 there was no notion of closures built into the Java language. Note that in TinkerPop3, Gremlin has changed dramatically to take advantage of the fact that we now have lambdas. But as you are on TinkerPop2, you have to use the built-in PipeFunction, which essentially represents typed versions of our Groovy closures. The PipeFunction for both arguments to loop is:

PipeFunction<LoopPipe.LoopBundle<E>,Boolean>

So basically, this is a function that gets a LoopPipe.LoopBundle as an object which contains metadata about the loop and expects that you return a boolean value. If you understand that concept, then all of Gremlin Java opens up for you, because everywhere that you see a groovy closure, you know that underneath it is just some form of PipeFunction in java and given that you can now read the expectations of a PipeFunction from the javadocs, it should be straightforward to do these language translations.
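The shape of that idea can be sketched without any TinkerPop dependency. In the snippet below, `PipeFunctionLike` is a hypothetical stand-in that only mirrors the one-method shape described here (a single `compute` method taking an argument and returning a value); it is not the library's interface. It shows the same function written as an anonymous inner class and, assuming Java 8 or later is available, as a lambda.

```java
// Hypothetical stand-in mirroring the shape described above: one method, compute.
interface PipeFunctionLike<A, B> {
    B compute(A argument);
}

final class PipeFunctionSketch {
    // Anonymous inner class: the only way to pass behaviour before Java 8.
    static final PipeFunctionLike<String, Boolean> ANONYMOUS =
        new PipeFunctionLike<String, Boolean>() {
            @Override
            public Boolean compute(String argument) {
                return "emitIfThis".equals(argument);
            }
        };

    // The same function as a lambda; this compiles because the interface
    // declares exactly one abstract method.
    static final PipeFunctionLike<String, Boolean> LAMBDA =
        argument -> "emitIfThis".equals(argument);

    public static void main(String[] args) {
        System.out.println(ANONYMOUS.compute("emitIfThis"));
        System.out.println(LAMBDA.compute("other"));
    }
}
```

Both forms are interchangeable wherever the interface type is expected, which is why a Groovy closure and a Java function object can play the same role in a step.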

The first closure translation we have to do is as straightforward as it comes - we just need our PipeFunction to return true:

new GremlinPipeline(g.getVertex(1)).out().loop(1, 
    new PipeFunction<LoopPipe.LoopBundle<Vertex>,Boolean>() {
        public Boolean compute(LoopPipe.LoopBundle<Vertex> argument) {
            return true;
        }
    }, closure)

So, for the second argument to loop we have to construct a new PipeFunction, which has one method called compute. From that method we return true. Now to handle the second PipeFunction argument that controls the vertices to emit:

new GremlinPipeline(g.getVertex(1)).out().loop(1, 
    new PipeFunction<LoopPipe.LoopBundle<Vertex>,Boolean>() {
        public Boolean compute(LoopPipe.LoopBundle<Vertex> argument) {
            return true;
        }
    }, 
    new PipeFunction<LoopPipe.LoopBundle<Vertex>,Boolean>() {
        public Boolean compute(LoopPipe.LoopBundle<Vertex> argument) {
            // Constant-first equals avoids a NullPointerException when the property is absent
            return "emitIfThis".equals(argument.getObject().getProperty("someProperty"));
        }
    })

And there stands the conversion. As this is a long post, let's place the original Groovy closer to the above so that the differences are clear:

g.v(1).out.loop(1){true}{it.object.someProperty=="emitIfThis"}

We went from the above one line of code to nearly a full dozen on what was an otherwise very simple traversal. Gremlin Java comes into its own in TinkerPop3 given lambdas and a major overhaul of the language itself, but these prior versions produce Java code that really isn't worth the effort of generating or maintaining when Groovy can make things very neat and tidy.
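For the second use case in the question (several chains hanging off one start vertex, stopping at the first match in each before moving on to the next), the control flow might look like the following plain-Java sketch. It is not a Gremlin traversal: `ChainNode`, `out`, and `firstMatchPerChain` are hypothetical names used only for illustration, and it assumes each chain is linear and acyclic, as described in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a vertex; `out` holds its outgoing neighbours.
final class ChainNode {
    final String someProperty;
    final List<ChainNode> out = new ArrayList<>();

    ChainNode(String someProperty) {
        this.someProperty = someProperty;
    }
}

final class MultiChainSketch {
    // For each chain leaving `start`, walk it until the first node accepted by
    // `matches`, emit that node, then move on to the next chain.
    static List<ChainNode> firstMatchPerChain(ChainNode start, Predicate<ChainNode> matches) {
        List<ChainNode> emitted = new ArrayList<>();
        for (ChainNode head : start.out) {
            ChainNode current = head;
            while (current != null) {
                if (matches.test(current)) {
                    emitted.add(current);
                    break; // done with this chain
                }
                // Linear chain assumption: follow the only outgoing neighbour, if any.
                current = current.out.isEmpty() ? null : current.out.get(0);
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        ChainNode start = new ChainNode("start");
        ChainNode x = new ChainNode("other");
        ChainNode y = new ChainNode("emitIfThis");
        ChainNode p = new ChainNode("emitIfThis");
        start.out.add(x);
        x.out.add(y);
        start.out.add(p);

        List<ChainNode> hits =
            firstMatchPerChain(start, n -> "emitIfThis".equals(n.someProperty));
        System.out.println(hits.size());
    }
}
```

Once the logic is settled in this form, the same advice applies: express it as a Gremlin Groovy traversal in the console first, and only then translate it step by step to GremlinPipeline.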

stephen mallette answered Dec 21 '22 23:12
