Significant performance difference between neo4j direct access and via OGM

Tags: java, neo4j, neo4j-ogm

I am evaluating the performance of the Neo4j graph database with a simple benchmark for insert, update, delete and query. Using Neo4j OGM I see significantly slower execution times (about 2-4 times slower) compared to direct access via the Neo4j driver. For example, the delete operation (see code below) takes 500ms via the driver vs 1200ms via OGM for 10K nodes and 11K relationships on my machine. I wonder why this happens, especially because the deletion code below doesn't even use any node entity. I can imagine that OGM has some overhead, but this seems to be too much. Does anyone have an idea why it's slower?

Example node:

public abstract class AbstractBaseNode {

    @GraphId
    @Index(unique = true)
    private Long id;

    public Long getId() {
        return id;
    }
}

@NodeEntity
public class Company extends AbstractBaseNode {

    private String name;

    public Company(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

Example code for delete via driver:

driver = GraphDatabase.driver( "bolt://localhost:7687", AuthTokens.basic( "neo4j", "secret" ) );
session = driver.session();

long start = System.nanoTime();

session.run("MATCH (n) DETACH DELETE n").list();

System.out.println("Deleted all nodes " + ((System.nanoTime() - start) / 1000) + "μs");

Example code for delete via OGM:

public org.neo4j.ogm.config.Configuration neo4jConfiguration() {
    org.neo4j.ogm.config.Configuration config =  new org.neo4j.ogm.config.Configuration();
    config.autoIndexConfiguration().setAutoIndex(AutoIndexMode.DUMP.getName());
    config.driverConfiguration()
            .setDriverClassName("org.neo4j.ogm.drivers.bolt.driver.BoltDriver")
            .setURI("bolt://neo4j:secret@localhost")
            .setConnectionPoolSize(10);

    return config;
}

sessionFactory = new SessionFactory(neo4jConfiguration(), "net.mypackage");
session = sessionFactory.openSession();

long start = System.nanoTime();

session.query("MATCH (n) DETACH DELETE n", Collections.emptyMap()).forEach(x -> {});

System.out.println("Deleted all nodes " + ((System.nanoTime() - start) / 1000) + "μs");

asked May 11 '17 09:05

Steffen Harbich

1 Answer

I will start by pointing out that your test samples are poor. When taking timing samples, you want to stress the system so that each run takes a fair amount of time. The tests should also measure what you're actually interested in (are you testing how fast you can create and drop connections? Maximum Cypher throughput? The speed of a single large transaction?). With tests that take barely a second, it is impossible to tell whether the difference in performance comes from the query call or just from startup overhead (despite the name, the session doesn't actually connect until you call query(...)).
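To make the point about startup overhead concrete, here is a minimal sketch of a timing loop that warms up first and then reports the median of several timed runs. The class name `MicroBench`, the method `medianNanos` and the placeholder workload are all hypothetical; in a real test the `setup` step would recreate the nodes and the `task` step would be the driver or OGM delete call. This is not a substitute for a proper benchmarking tool, just an illustration of separating one-time costs from the measured call.

```java
import java.util.Arrays;

/** Hypothetical timing helper: warm up first, then report the median of several runs. */
final class MicroBench {

    /**
     * Runs setup + task `warmups` times untimed, then `runs` times with only the
     * task timed, and returns the median task time in nanoseconds.
     */
    static long medianNanos(Runnable setup, Runnable task, int warmups, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        for (int i = 0; i < warmups; i++) {
            setup.run();
            task.run(); // absorbs connection setup, class loading and JIT warm-up
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            setup.run(); // e.g. recreate the test data; deliberately not timed
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2]; // upper median when runs is even
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption); replace with the real delete call.
        Runnable noSetup = () -> { };
        Runnable workload = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum == 42) {
                System.out.println(sum); // uses the result so the loop is not trivially dead code
            }
        };
        long median = medianNanos(noSetup, workload, 5, 11);
        System.out.println("median: " + (median / 1000) + "us");
    }
}
```

Running the same helper once with the driver call and once with the OGM call as `task` would at least put both variants through identical warm-up and measurement steps.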

As far as I can tell, both versions perform about the same in a normal setup. The only thing I can think of that would affect this is the OGM doing something that starves other processes of system resources.

UPDATE

Cypher generated by the OGM to create a node (as logged, with its parameters):

UNWIND {rows} as row 
CREATE (n:Company) 
SET n=row.props 
RETURN row.nodeRef as ref, ID(n) as id, row.type as type
with params {rows=[{nodeRef=-1206180304, type=node, props={name=company_1029}}]}

Equivalent Cypher sent through the driver:

CREATE (a:Company {name: {name}}) // X10,000

The key difference between the driver and the OGM is that the driver does exactly what you tell it to do, which is the most efficient way of doing things, whereas the OGM tries to manage the query logic for you (what to return, how to save things, what to try to save). The OGM version is more reliable because it will automatically try to consolidate nodes with the database (if possible) and will only save things that have actually changed. Since your node class doesn't have a primary key to consolidate on, it has to create everything.

The OGM Cypher is more versatile, but it also requires more memory use/access. SET n.name="rawr" is 1 db hit per property. SET n={name:"rawr"} is 3 db hits, though (about 1+2*#_of_props; {name:"rawr", id:2} is 5 db hits). That is why the OGM Cypher is slower. The OGM does have smart management, however: if you then edit one node and try to save it, with the driver you would have to either save everything or implement your own manager, while the OGM will only save the updated node.
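The db hit arithmetic above can be written out as a tiny calculation. This is only the rule of thumb from this answer (1 hit per property for individual SET n.prop assignments, about 1+2*#_of_props for a map assignment), not a measurement; the class and method names are made up for illustration, and running the queries with PROFILE would show the real numbers for a given Neo4j version.

```java
/** Hypothetical helper that encodes the rule-of-thumb db hit counts from the answer. */
final class DbHitEstimate {

    /** SET n.a = ..., n.b = ... : roughly one db hit per property. */
    static int perPropertySet(int props) {
        return props;
    }

    /** SET n = {a: ..., b: ...} : roughly 1 + 2 * (number of properties). */
    static int mapAssignment(int props) {
        return 1 + 2 * props;
    }

    public static void main(String[] args) {
        for (int props = 1; props <= 2; props++) {
            System.out.println(props + " properties: per-property SET = "
                    + perPropertySet(props) + " db hits, map assignment = "
                    + mapAssignment(props) + " db hits");
        }
    }
}
```

For one property this gives 1 versus 3 db hits, and for two properties 2 versus 5, matching the figures quoted above.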

So in short, the Cypher the OGM generates is less efficient than what you would write using the driver, but the OGM has smart management built in that can make it faster than a blind driver implementation in real business logic situations (loading/editing large numbers of nodes). Of course, you can implement your own management with the driver to be faster, so it's a trade-off between speed and development effort. The more speed you want, the more time you have to put into managing every tiny aspect (and the point of the OGM is that you plug it in and it just works).
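To give an idea of what "implementing your own management" with the driver might involve, here is a minimal sketch of change tracking: remember a snapshot of each node's properties when it is loaded, and later write back only the nodes whose properties differ. The `DirtyTracker` class is hypothetical and is not claimed to be how the OGM does it internally; it only illustrates the kind of bookkeeping you would take on yourself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical change tracker: reports which nodes differ from their loaded snapshot. */
final class DirtyTracker<K> {

    private final Map<K, Map<String, Object>> snapshots = new HashMap<>();

    /** Stores a defensive copy of the properties as they were when the node was loaded. */
    void remember(K id, Map<String, Object> props) {
        snapshots.put(id, new HashMap<>(props));
    }

    /** Returns the ids whose current properties differ from the snapshot, or that were never seen. */
    List<K> changed(Map<K, Map<String, Object>> current) {
        List<K> dirty = new ArrayList<>();
        for (Map.Entry<K, Map<String, Object>> entry : current.entrySet()) {
            // Map.equals(null) is false, so unknown ids are reported as dirty too.
            if (!entry.getValue().equals(snapshots.get(entry.getKey()))) {
                dirty.add(entry.getKey());
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        DirtyTracker<Long> tracker = new DirtyTracker<>();
        tracker.remember(1L, Collections.singletonMap("name", "company_1"));
        tracker.remember(2L, Collections.singletonMap("name", "company_2"));

        // Only node 2 is edited after loading.
        Map<Long, Map<String, Object>> current = new LinkedHashMap<>();
        current.put(1L, Collections.singletonMap("name", "company_1"));
        current.put(2L, Collections.singletonMap("name", "renamed"));

        // Only these ids would need an update statement sent through the driver.
        System.out.println("nodes to save: " + tracker.changed(current));
    }
}
```

In this example only node 2 is reported, so a driver-based implementation would send a single update instead of rewriting every node.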

answered Nov 08 '22 17:11

Tezra
