OrientDB concurrent graph operations in Java

Tags: java, orientdb

I'm trying to use OrientDB (v2.1.2) in a multithreaded environment (Java 8) where I update a vertex from multiple threads. I'm aware that OrientDB uses MVCC, so those operations may fail and have to be retried.

I wrote a small unit test that tries to provoke such conflicts by having the threads I fork wait on a cyclic barrier. Unfortunately, the test fails with an obscure exception that I don't understand:

Sep 21, 2015 3:00:24 PM com.orientechnologies.common.log.OLogManager log
INFO: OrientDB auto-config DISKCACHE=10,427MB (heap=3,566MB os=16,042MB disk=31,720MB)
Thread [0] running 
Thread [1] running 
Sep 21, 2015 3:00:24 PM com.orientechnologies.common.log.OLogManager log
WARNING: {db=tinkerpop} Requested command 'create edge type 'testedge_1442840424480' as subclass of 'E'' must be executed outside active transaction: the transaction will be committed and reopen right after it. To avoid this behavior execute it outside a transaction
Sep 21, 2015 3:00:24 PM com.orientechnologies.common.log.OLogManager log
WARNING: {db=tinkerpop} Requested command 'create edge type 'testedge_1442840424480' as subclass of 'E'' must be executed outside active transaction: the transaction will be committed and reopen right after it. To avoid this behavior execute it outside a transaction
Exception in thread "Thread-4" com.orientechnologies.orient.core.exception.OSchemaException: Cluster with id 11 already belongs to class testedge_1442840424480
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.checkClustersAreAbsent(OSchemaShared.java:1264)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.doCreateClass(OSchemaShared.java:983)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:415)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.createClass(OSchemaShared.java:400)
    at com.orientechnologies.orient.core.metadata.schema.OSchemaProxy.createClass(OSchemaProxy.java:100)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph$6.call(OrientBaseGraph.java:1387)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph$6.call(OrientBaseGraph.java:1384)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.executeOutsideTx(OrientBaseGraph.java:1739)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1384)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1368)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1353)
    at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:928)
    at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:832)
    at com.gentics.test.orientdb.OrientDBTinkerpopMultithreadingTest.lambda$0(OrientDBTinkerpopMultithreadingTest.java:31)
    at com.gentics.test.orientdb.OrientDBTinkerpopMultithreadingTest$$Lambda$1/1446001495.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:745)

The test uses a simple in-memory database. I don't understand why OrientDB is checking clusters here:

Cluster with id 11 already belongs to class testedge

The issue only appears when I try to create two edges with the same label.

private OrientGraphFactory factory = new OrientGraphFactory("memory:tinkerpop").setupPool(5, 20);

@Test
public void testConcurrentGraphModifications() throws InterruptedException {
    OrientGraph graph = factory.getTx();
    Vertex v = graph.addVertex(null);
    graph.commit();
    CyclicBarrier barrier = new CyclicBarrier(2);

    List<Thread> threads = new ArrayList<>();

    // Spawn two threads
    for (int i = 0; i < 2; i++) {
        final int threadNo = i;
        threads.add(run(() -> {
            System.out.println("Running thread [" + threadNo + "]");
            // Start a new transaction and modify vertex v
            OrientGraph tx = factory.getTx();
            Vertex v2 = tx.addVertex(null);
            v.addEdge("testedge", v2);
            try {
                barrier.await();
            } catch (Exception e) {
                e.printStackTrace();
            }
            tx.commit();
        }));
    }

    // Wait for all spawned threads
    for (Thread thread : threads) {
        thread.join();
    }
}

protected Thread run(Runnable runnable) {
    Thread thread = new Thread(runnable);
    thread.start();
    return thread;
}

In general, I would be very thankful for an example that demonstrates how to deal with MVCC conflicts when using OrientDB in an embedded, multithreaded Java environment.


Update:

I noticed that the problem no longer occurs when I reload the vertex within my thread via tx.getVertex(vertex.getId()) (not via .reload()). I get various errors when I just pass the vertex object reference to my thread and use it there. I assume the OrientVertex class is not thread-safe.

Jotschi asked Sep 26 '22 16:09


1 Answer

  1. You are right: graph elements are not thread-safe.
  2. The reason for your exception is the following. When you create an edge, the graph database underneath creates a document whose class equals the edge's label. If that class does not exist yet, the transaction is committed automatically and a new class is created in the schema. Each class is mapped to a cluster in the database (a cluster is like a table). When you add edges concurrently, both threads try to create the same class at the same time, and with it the same cluster. One thread wins and the other fails with the exception reporting that the cluster already belongs to the class. I suggest that, if possible, you create all classes (that is, edge labels) up front, before you add edges at runtime.
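To illustrate the "create the classes first" advice without depending on OrientDB, here is a minimal sketch in plain Java. The `Schema` class and the `runWriters` helper are hypothetical stand-ins, not OrientDB's API: `Schema.createClass` simply rejects duplicates, loosely mirroring the OSchemaException from the question. The point is only the ordering: schema setup happens once on a single thread, and the concurrent writers never create schema themselves.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class PreCreatedSchemaDemo {
    /** Hypothetical stand-in for the database schema; not OrientDB's API. */
    static final class Schema {
        private final Set<String> classes = ConcurrentHashMap.newKeySet();

        /** Rejects duplicates, similar in spirit to the OSchemaException in the question. */
        void createClass(String name) {
            if (!classes.add(name)) {
                throw new IllegalStateException("Class already exists: " + name);
            }
        }

        boolean hasClass(String name) {
            return classes.contains(name);
        }
    }

    /** Starts all writers together and returns how many of them failed. */
    static int runWriters(Schema schema, String label, int writers) {
        AtomicInteger failures = new AtomicInteger();
        CountDownLatch startSignal = new CountDownLatch(1);
        Thread[] threads = new Thread[writers];
        for (int i = 0; i < writers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    startSignal.await();
                    // Writers only use the class; they never create schema themselves.
                    if (!schema.hasClass(label)) {
                        throw new IllegalStateException("Missing class: " + label);
                    }
                } catch (InterruptedException | IllegalStateException e) {
                    failures.incrementAndGet();
                }
            });
            threads[i].start();
        }
        startSignal.countDown(); // release all writers at once
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for writers", e);
            }
        }
        return failures.get();
    }
}
```

In the test from the question, the equivalent setup step would presumably be a single call to createEdgeType("testedge") (the method visible in the stack trace) before the two threads are started, so that addEdge no longer has to create the class on the fly.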

One more suggestion: think of an OrientGraph instance as if it were a connection to the server. The best usage is the following:

  1. Set up a pool in OrientGraphFactory.
  2. Acquire a graph instance before the transaction.
  3. Execute the transaction.
  4. Call .shutdown(); do not create long-living graph instances.
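The four steps above, combined with the retry that MVCC requires, can be sketched roughly as follows. This is a generic, library-independent sketch rather than OrientDB's own API: `TxRetry`, `execute`, and `ConflictException` are hypothetical names, and `ConflictException` stands in for whatever exception signals a version conflict (in OrientDB that is OConcurrentModificationException). Each attempt acquires a fresh resource, runs the work, and always releases the resource again.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

final class TxRetry {
    /** Hypothetical stand-in for the exception that signals an MVCC version conflict. */
    static final class ConflictException extends RuntimeException {
        ConflictException(String message) {
            super(message);
        }
    }

    /**
     * Acquires a fresh resource per attempt, runs the work, and always releases
     * the resource. Retries only on ConflictException, up to maxAttempts times.
     */
    static <R, T> T execute(Supplier<R> acquire, Function<R, T> work,
                            Consumer<R> release, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConflictException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            R resource = acquire.get();       // step 2: e.g. factory.getTx()
            try {
                return work.apply(resource);  // step 3: load by ID, modify, commit
            } catch (ConflictException e) {
                last = e;                     // another writer won; retry with fresh state
            } finally {
                release.accept(resource);     // step 4: e.g. graph.shutdown()
            }
        }
        throw last; // every attempt hit a conflict
    }
}
```

With OrientDB, `acquire` would be something like factory::getTx, `release` would call shutdown() on the graph, and `work` would look the vertex up by ID via getVertex(...) on that same graph instance instead of reusing a vertex object from another thread, as noted in the question's update. Only the ID crosses thread boundaries, and every retry starts from a freshly loaded record.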
Andrey Lomakin answered Nov 06 '22 05:11