
Neo4j over bolt protocol has very high latency

I'm using Neo4j for a project using the official Neo4j driver for .NET found here:

https://www.nuget.org/packages/Neo4j.Driver

This driver works over the Bolt protocol; my assumption was that a specialized binary protocol would be more efficient than the HTTP API. But ever since the start of the project, I've noticed relatively high latency from Neo4j for even very simple operations. For example, a match like the following takes 30-60ms, even though UserID is an indexed field and the database is otherwise completely empty:

match(n:User { UserID: 1 }) return n.UserID

This behavior occurs both on my local machine (near-zero network overhead) and in our production environment. I started investigating this today and found that the call to Run returns quickly, but it takes a long time to actually stream in the results. For example, on localhost the query below returns from the call in 0.2ms, but then calling ToArray() on the result (buffering the records, which in this case is a single integer field) increases the total time to 60ms.

using (var driver = GraphDatabase.Driver("bolt://localhost:7687", AuthTokens.Basic("neo4j", "1")))
{    
    using (var session = driver.Session())
    {
        // 0.2ms to return from this call
        var result = session.Run("match(n:User { ID: 1}) return n.ID"); 

        // Uncommenting this makes the whole thing take 60ms
        // result.ToArray(); 
    }
}
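The two timings above are what you would expect if the result is streamed lazily, so that the cost is paid when the records are enumerated rather than when the call returns. The following is a minimal sketch of that measurement pattern, independent of Neo4j: `SlowRecords` and `Measure` are hypothetical names, and the `Thread.Sleep` only simulates waiting for the first record. It is not the driver's actual implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public static class LazyTimingSketch
{
    // Hypothetical stand-in for a streamed result. Because this is an iterator
    // method, none of its body runs until the caller starts enumerating.
    public static IEnumerable<int> SlowRecords(int delayMs)
    {
        Thread.Sleep(delayMs); // simulated wait for the first record to arrive
        yield return 1;
    }

    // Times the call itself and the buffering of its records separately (in ms).
    public static (double CallMs, double BufferMs) Measure(Func<IEnumerable<int>> run)
    {
        var sw = Stopwatch.StartNew();
        var result = run();                       // returns almost immediately
        double callMs = sw.Elapsed.TotalMilliseconds;

        sw.Restart();
        result.ToArray();                         // the deferred work happens here
        double bufferMs = sw.Elapsed.TotalMilliseconds;

        return (callMs, bufferMs);
    }

    public static void Main()
    {
        var (callMs, bufferMs) = Measure(() => SlowRecords(50));
        Console.WriteLine($"call returned after {callMs:F2} ms, buffering took {bufferMs:F2} ms");
    }
}
```

Timing the two phases separately like this shows where the time goes, but not why: it cannot tell server-side work apart from time spent in the driver's socket layer.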

I then tried the community sponsored Neo4jClient package, which works over HTTP:

https://github.com/Readify/Neo4jClient

With the same query, the total time is reduced to just 0.5ms:

var client = new GraphClient(new Uri("http://localhost:7474/db/data"), "neo4j", "1");
client.Connect();

client.Cypher.Match("(n:User { ID: 1})").Return<int>("n.ID").Results.ToArray();

Running a more formal benchmark with BenchmarkDotNet gives the following results, which show a huge difference between the Bolt-based official driver and the HTTP-based Neo4jClient.

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4770 CPU 3.40GHz, ProcessorCount=8
Frequency=3312642 ticks, Resolution=301.8739 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
GC=Concurrent Workstation
JitModules=clrjit-v4.6.1586.0

Type=Neo4jBenchmarks  Mode=Throughput  Platform=X64  
Jit=RyuJit  

       Method |         Median |      StdDev | Scaled | Scaled-SD |
------------- |--------------- |------------ |------- |---------- |
  Neo4jClient |    382.5675 us |   3.3771 us |   1.00 |      0.00 |
 Neo4jSession | 61,299.9382 us | 690.1626 us | 160.02 |      2.24 |

So the HTTP client is 160x faster when network overhead is negligible.

I also ran the benchmark on our production environment and while the difference wasn't as large, the HTTP method was still 6x faster (and my network connection to production is pretty slow).

The full benchmark code:

public class Neo4jBenchmarks
{
    private readonly IDriver _driver;
    private readonly GraphClient _client;

    public Neo4jBenchmarks()
    {
      _driver = GraphDatabase.Driver("bolt://localhost:7687", AuthTokens.Basic("neo4j", "1"));
      _client = new GraphClient(new Uri("http://localhost:7474/db/data"), "neo4j", "1");
      _client.Connect();
    }

    [Benchmark(Baseline = true)]
    public void Neo4jClient()
    {
      _client.Cypher.Match("(n:User { ID: 1})").Return<int>("n.ID").Results.ToArray();
    }

    [Benchmark]
    public void Neo4jSession()
    {
      using (var session = _driver.Session())
      {
        session.Run("match(n:User { ID: 1}) return n.ID").ToArray();
      }
    }
}

Both my machine and production are running Neo4j 3.0.4 Community Edition, though I'm running it on Windows 10 and production is a Linux machine. We haven't tweaked any settings to my knowledge, but I doubt that could explain a 160x performance difference.

Because creating a session involves creating a transaction, I also tried reusing the session object to see if that made a difference (though I think reusing it is a very bad idea, since it isn't thread-safe), but there was no noticeable change.

I wish I could use Neo4jClient, but we really need the ability to execute arbitrary string queries. Neo4jClient relies heavily on a fluent API, and while it offers a low-level string mode, that mode is deprecated and actively discouraged in the documentation.

JulianR asked Oct 14 '16 19:10


1 Answer

After further digging, I traced the problem to the Neo4j.Driver package specifically, as the driver for Node.js didn't suffer from the same issue.

Cloning the current source of the package, building it, and referencing the DLL directly instead of the NuGet package eliminated the problem entirely. To put this into perspective: the version currently on NuGet (1.0.2) takes 62 seconds to do 1000 simple match requests against localhost, whereas the current source does so in 0.3 seconds (even beating the Node.js driver by a factor of 10).
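A comparison like this can be reproduced with a simple timing loop. The sketch below is one way to do it, not the exact code behind the numbers above; `RequestLoopTimer` and `Time` are hypothetical names, and the request is passed in as a delegate so the same loop can wrap either driver build.

```csharp
using System;
using System.Diagnostics;

public static class RequestLoopTimer
{
    // Runs `request` `count` times and returns the total elapsed time
    // together with the average time per request in milliseconds.
    public static (TimeSpan Total, double PerRequestMs) Time(Action request, int count)
    {
        if (request == null) throw new ArgumentNullException(nameof(request));
        if (count <= 0) throw new ArgumentOutOfRangeException(nameof(count));

        request(); // warm-up call so one-time setup cost is not counted

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            request();
        }
        sw.Stop();

        return (sw.Elapsed, sw.Elapsed.TotalMilliseconds / count);
    }
}
```

Passing a delegate that runs the match query from the question and buffers its result with ToArray(), with a count of 1000, would mirror the test described here. As a sanity check on the figures: 62 seconds over 1000 requests is 62ms per request, in line with the roughly 61ms median from the benchmark in the question, while 0.3 seconds over 1000 requests is 0.3ms per request.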

I'm not sure of the exact cause, but I suspect it has something to do with the rda.SocketsForPCL dependency of the current package, which appears to be a glue library to make sockets work cross-platform. The current source references the System.Net.Sockets package instead.

So in conclusion, this issue can be worked around by referencing a current build of the source and will be resolved entirely when a new version of the package is released.

JulianR answered Sep 20 '22 11:09