Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to Create a Node with Neo4jClient in Neo4j v2?

Under Neo4j v1.9.x, I used the following sort of code.

private Category CreateNodeCategory(Category cat)
{
        var node = client.Create(cat,
            new IRelationshipAllowingParticipantNode<Category>[0],
            new[]
            {
                new IndexEntry(NeoConst.IDX_Category)
                {
                    { NeoConst.PRP_Name, cat.Name },
                    { NeoConst.PRP_Guid, cat.Nguid.ToString() }
                }
            });
        cat.Nid = node.Id;
        client.Update<Category>(node, cat);
        return cat;
}

The reason being that the Node Id was auto generated and I could use it later for a quick look up, start bits in other queries, etc. Like the following:

    private Node<Category> CategoryGet(long nodeId)
    {
        return client.Get<Category>((NodeReference<Category>)nodeId);
    }

This enables the following which appeared to work well.

    public Category CategoryAdd(Category cat)
    {
        cat = CategoryFind(cat);
        if (cat.Nid != 0) { return cat; }
        return CreateNodeCategory(cat);
    }

    public Category CategoryFind(Category cat)
    {
        if (cat.Nid != 0) { return cat; }
        var node = client.Cypher.Start(new { 
    n = Node.ByIndexLookup(NeoConst.IDX_Category, NeoConst.PRP_Name, cat.Name)})
            .Return<Node<Category>>("n")
            .Results.FirstOrDefault();
        if (node != null) { cat = node.Data; }
        return cat;
    }

Now the cypher Wiki, examples and bad-habits recommend using the .ExecuteWithoutResults() in all the CRUD.

So the question I have is how do you have an Auto Increment value for the node ID?

like image 799
Cheval Avatar asked Oct 23 '13 06:10

Cheval


1 Answers

First up, for Neo4j 2 and onwards, you always need to start with the frame of reference "how would I do this in Cypher?". Then, and only then, do you worry about the C#.

Now, distilling your question, it sounds like your primary goal is to create a node, and then return a reference to it for further work.

You can do this in cypher with:

CREATE (myNode)
RETURN myNode

In C#, this would be:

var categoryNode = graphClient.Cypher
    .Create("(category {cat})")
    .WithParams(new { cat })
    .Return(cat => cat.Node<Category>())
    .Results
    .Single();

However, this still isn't 100% what you were doing in your original CreateNodeCategory method. You are creating the node in the DB, getting Neo4j's internal identifier for it, then saving that identifier back into the same node. Basically, you're using Neo4j to generate auto-incrementing numbers for you. That's functional, but not really a good approach. I'll explain more ...

First up, the concept of Neo4j even giving you the node id back is going away. It's an internal identifier that actually happens to be a file offset on disk. It can change. It is low level. If you think about SQL for a second, do you use a SQL query to get the file byte offset of a row, then reference that for future updates? A: No; you write a query that finds and manipulates the row all in one hit.

Now, I notice that you already have an Nguid property on the nodes. Why can't you use that as the id? Or if the name is always unique, use that? (Domain relevant ids are always preferable to magic numbers.) If neither are appropriate, you might want to look at a project like SnowMaker to help you out.

Next, we need to look at indexing. The type of indexing that you're using is referred to in the 2.0 docs as "Legacy Indexing" and misses out on some of the cool Neo4j 2.0 features.

For the rest of this answer, I'm going to assume your Category class looks like this:

public class Category
{
    public Guid UniqueId { get; set; }
    public string Name { get; set; }
}

Let's start by creating our category node with a label:

var category = new Category { UnqiueId = Guid.NewGuid(), Name = "Spanners" };
graphClient.Cypher
    .Create("(category:Category {category})")
    .WithParams(new { category })
    .ExecuteWithoutResults();

And, as a one-time operation, let's establish a schema-based index on the Name property of any nodes with the Category label:

graphClient.Cypher
    .Create("INDEX ON :Category(Name)")
    .ExecuteWithoutResults();

Now, we don't need to worry about manually keeping indexes up to date.

We can also introduce an index and unique constraint on UniqueId:

graphClient.Cypher
    .Create("CONSTRAINT ON (category:Category) ASSERT category.UniqueId IS UNIQUE")
    .ExecuteWithoutResults();

Querying is now very easy:

graphClient.Cypher
    .Match("(c:Category)")
    .Where((Category c) => c.UniqueId == someGuidVariable)
    .Return(c => c.As<Category>())
    .Results
    .Single();

Rather than looking up a category node, to then do another query, just do it all in one go:

var productsInCategory = graphClient.Cypher
    .Match("(c:Category)<-[:IN_CATEGORY]-(p:Product)")
    .Where((Category c) => c.UniqueId == someGuidVariable)
    .Return(p => p.As<Product>())
    .Results;

If you want to update a category, do that in one go as well:

graphClient.Cypher
    .Match("(c:Category)")
    .Where((Category c) => c.UniqueId == someGuidVariable)
    .Update("c = {category}")
    .WithParams(new { category })
    .ExecuteWithoutResults();

Finally, your CategoryAdd method currently 1) does one DB hit to find an existing node, 2) a second DB hit to create a new one, 3) a third DB hit to update the ID on it. Instead, you can compress all of this to a single call too using the MERGE keyword:

public Category GetOrCreateCategoryByName(string name)
{
    return graphClient.Cypher
        .WithParams(new {
            name,
            newIdIfRequired = Guid.NewGuid()
        })
        .Merge("(c:Category { Name = {name})")
        .OnCreate("c")
        .Set("c.UniqueId = {newIdIfRequired}")
        .Return(c => c.As<Category>())
        .Results
        .Single();
}

Basically,

  1. Don't use Neo4j's internal ids as a way to hack around managing your own identities. (But they may release some form of autonumbering in the future. Even if they do, domain identities like email addresses or SKUs or airport codes or ... are preferred. You don't even always need an id: you can often infer a node based on its position in the graph.)

  2. Generally, Node<T> will disappear over time. If you use it now, you're just accruing legacy code.

  3. Look into labels and schema-based indexing. They will make your life easier.

  4. Try and do things in the one query. It will be much faster.

Hope that helps!

like image 92
Tatham Oddie Avatar answered Sep 19 '22 11:09

Tatham Oddie