How do I use Neo4j-embedded for Python (threads) in Flask microframework?

I'm following the Flask Tutorial (Flaskr) in order to experiment with using Neo4j-embedded for Python. This is in a virtualenv. Here is my 'main' app code:

import os
import jpype
from neo4j import GraphDatabase
from flask import Flask, request, session, g, redirect, url_for, abort, render_template, flash

app = Flask(__name__)
app.config.from_pyfile(os.environ['APP_SETTINGS'])


def connectDB(): 
    return GraphDatabase(app.config['DATABASE'])


def initDB():
    db = connectDB()

    with db.transaction:
        users = db.node()
        roles = db.node()

        db.reference_node.USERS(users)
        db.reference_node.ROLES(roles)

        userIndex = db.node.indexes.create('users')

        user = db.node(name=app.config['ADMIN'])
        user.INSTANCE_OF(users)
        userIndex['name'][app.config['ADMIN']] = user

        role = db.node(type='superadmin')
        role.INSTANCE_OF(roles)

        role.ASSIGN_TO(user)

    db.shutdown()

    print "Database initialized."


def testDB():
    db = connectDB()

    with db.transaction:
        userIndex = db.node.indexes.get('users')
        user = userIndex['name'][app.config['ADMIN']].single
        username = user['name']

    db.shutdown()

    print "Admin username is '%s'. Database exists." % username


@app.before_request
def before_request():
    jpype.attachThreadToJVM()
    g.db = connectDB()


@app.teardown_request
def teardown_request(exception):
    g.db.shutdown()


@app.route('/')
def index():

    with g.db.transaction:
        userIndex = g.db.node.indexes.get('users')
        user = userIndex['name'][app.config['ADMIN']].single
        username = user['name']

    fields = dict(username=username)
    return render_template('index.html', fields=fields)


if not os.path.exists(app.config['DATABASE']):
    initDB()
else:
    testDB()

initDB() and testDB() work perfectly fine with just jpype and neo4j-embedded (no Gremlin, PyLucene, etc.). Initially, the JVM would fail and the app would terminate whenever I requested index(). After scouring the net, I learned that I needed to add the line "jpype.attachThreadToJVM()" to before_request() so that the request thread is attached to the JVM; with that change the app no longer terminates. However, this leads immediately to another issue:

Traceback (most recent call last):
  File "/ht/dev/envFlask/lib/python2.7/site-packages/flask/app.py", line 1518, in __call__
    return self.wsgi_app(environ, start_response)
  File "/ht/dev/envFlask/lib/python2.7/site-packages/flask/app.py", line 1506, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/ht/dev/envFlask/lib/python2.7/site-packages/flask/app.py", line 1504, in wsgi_app
    response = self.full_dispatch_request()
  File "/ht/dev/envFlask/lib/python2.7/site-packages/flask/app.py", line 1264, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/ht/dev/envFlask/lib/python2.7/site-packages/flask/app.py", line 1262, in full_dispatch_request
    rv = self.dispatch_request()
  File "/ht/dev/envFlask/lib/python2.7/site-packages/flask/app.py", line 1248, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/ht/dev/apps/evobox/evobox/__init__.py", line 68, in index
    userIndex = g.db.node.indexes.get('users')
  File "/ht/dev/envFlask/lib/python2.7/site-packages/neo4j/index.py", line 36, in get
    return self._index.forNodes(name)
java.lang.RuntimeExceptionPyRaisable: java.lang.IllegalArgumentException: No index provider 'lucene' found. Maybe the intended provider (or one more of its dependencies) aren't on the classpath or it failed to load.

A Google search for the entire last line didn't go anywhere. Searching for just "java.lang.IllegalArgumentException: No index provider 'lucene' found." led to nothing in the context of Python.

The neo4j messages.log seems to show the database was opened 3 times (initDB(), testDB(), and index()). The classpath is the same for each instance:

Class Path: /ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/neo4j-jmx-1.5.M02.jar:/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/neo4j-lucene-index-1.5.M02.jar:/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/neo4j-graph-matching-1.5.M02.jar:/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/neo4j-1.5.M02.jar:/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/neo4j-kernel-1.5.M02.jar:/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/geronimo-jta_1.1_spec-1.1.1.jar:/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/lucene-core-3.1.0.jar:/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/neo4j-graph-algo-1.5.M02.jar:/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/neo4j-udc-1.5.M02.jar:/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/neo4j-cypher-1.5.M02.jar:/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/scala-library-2.9.0-1.jar

I also modified index() to call connectDB() and attachThreadToJVM() directly, like initDB() and testDB(), without using the 'g' global. This resulted in the exact same error.

What am I missing or overlooking to get neo4j-embedded and jpype working in a threaded request, and not just in the 'main' app?

Note: I'm aware of the RESTful Web Service solutions with py2neo or Rexster/Bulbs, but I want to avoid those for now.

EDIT: Using JPype-0.5.4.2, Neo4j-embedded-1.5.b2, Java-6-openjdk

Robert Samurai asked Jan 19 '23 08:01



1 Answer

One issue with this pattern (opening and shutting down the database on every request) is that you run the risk of starting multiple database instances pointed at the same location, which will lead to problems. What you want is a single instance of the database that follows the full lifecycle of your application.
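As a rough illustration of the "single instance for the whole application" idea, here is a minimal sketch. The `SharedDatabase` class and its `factory` argument are hypothetical names introduced for this example, not part of neo4j-embedded; in the question's app the factory would presumably be something like `lambda: GraphDatabase(app.config['DATABASE'])`. The only thing assumed about the database object is the `shutdown()` method already used in the question.

```python
import atexit
import threading


class SharedDatabase(object):
    """Hypothetical helper: opens one database handle per process and reuses it."""

    def __init__(self, factory):
        # factory is any zero-argument callable that opens the database,
        # e.g. lambda: GraphDatabase(app.config['DATABASE']) (assumption).
        self._factory = factory
        self._lock = threading.Lock()
        self._db = None

    def get(self):
        """Return the shared handle, opening it on first use."""
        with self._lock:
            if self._db is None:
                self._db = self._factory()
                # Shut the database down once, when the process exits.
                atexit.register(self.close)
            return self._db

    def close(self):
        """Shut the database down if it is open; safe to call repeatedly."""
        with self._lock:
            if self._db is not None:
                self._db.shutdown()
                self._db = None
```

With a helper along these lines, before_request() would still attach the thread but set `g.db` from `get()` instead of opening a new database, and teardown_request() would no longer call `shutdown()`.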

Why it's not finding the lucene provider is a harder question. The index provider is loaded using the Java service loader, which means that JPype should not affect it. As long as the JVM starts all right and the lucene-index implementation jar is on the classpath, it should work.
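To rule out the classpath half of that condition, one option is to check the "Class Path:" line from messages.log for the expected jars. The sketch below is only a string check; the `missing_jars` function and the list of prefixes are assumptions made for this example, and a jar being listed does not prove that it loaded correctly.

```python
def missing_jars(classpath, required_prefixes, sep=":"):
    """Return the required jar-name prefixes that no classpath entry matches."""
    # Keep only the file name of each classpath entry.
    names = [entry.rsplit("/", 1)[-1] for entry in classpath.split(sep) if entry]
    return [
        prefix for prefix in required_prefixes
        if not any(name.startswith(prefix) and name.endswith(".jar")
                   for name in names)
    ]


# Three entries taken from the classpath shown in the question.
JAVALIB = "/ht/dev/envFlask/local/lib/python2.7/site-packages/neo4j/javalib/"
classpath = ":".join([
    JAVALIB + "neo4j-lucene-index-1.5.M02.jar",
    JAVALIB + "neo4j-kernel-1.5.M02.jar",
    JAVALIB + "lucene-core-3.1.0.jar",
])

# Prefixes assumed to matter for the lucene index provider.
required = ["neo4j-lucene-index-", "lucene-core-", "neo4j-kernel-"]
print(missing_jars(classpath, required))
```

For the classpath in the question both lucene jars are listed, which is consistent with the answer's point that the classpath alone does not explain the error.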

It may be related to threading somehow. I'm currently writing a fix to automatically handle the "attachThreadToJVM()" calls. I'll also add a test case to make sure reading indexes from a separate thread works as expected.
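Until such a fix lands, an application-side workaround could look roughly like the decorator below. This is not the fix described above, just a sketch: `attaches_thread` is a hypothetical name, and the two callables are passed in so the example does not depend on JPype. In the question's setup they would presumably be `jpype.isThreadAttachedToJVM` and `jpype.attachThreadToJVM`.

```python
import functools


def attaches_thread(is_attached, attach):
    """Build a decorator that attaches the calling thread before each call.

    is_attached: zero-argument callable, True if this thread is attached.
    attach: zero-argument callable that attaches this thread to the JVM.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Attach only when needed, so repeated calls on one thread are cheap.
            if not is_attached():
                attach()
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

Any function that touches the database from a request thread could then be wrapped with it, instead of calling attachThreadToJVM() by hand in before_request().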

Progress on the threading work is being tracked in this mailing list thread:

http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Python-embedding-problems-with-shutdown-and-threads-td3476163.html

Jacob Davis-Hansson answered Jan 30 '23 00:01
