Is there a way to use Lucene to work with graph data?
One user has a relationship with many Lucene documents (Document Connections).
One user also has a relationship with other users (User Connections [Graph]).
If a user searches the Index, he gets back the documents that he has a relationship with. This is simple and straightforward.
What would be a way to get back the documents that the User Connections have a relationship with?
Indexing each document with all the users that have a relationship with it in a user_id field is one approach. However, when you query the index by supplying the User Connections of the user performing the search, the query size is unpredictable. Think of users that have thousands of User Connections. This will not scale.
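To make the scaling concern concrete, here is a minimal sketch in plain Python rather than Lucene itself. The toy index, the function names, and the sample data are all assumptions for illustration; the dictionary only stands in for a user_id field. It shows that the query needs one OR clause per User Connection, so the query grows with the size of the searcher's network. For reference, Lucene has traditionally capped a boolean query at 1024 clauses by default.

```python
from collections import defaultdict


def build_user_index(doc_users):
    """Toy stand-in for a user_id field: map user id -> set of document ids."""
    index = defaultdict(set)
    for doc_id, users in doc_users.items():
        for user in users:
            index[user].add(doc_id)
    return index


def docs_for_connections(index, connections):
    """OR together one clause per connection.

    Returns (matching document ids, number of clauses used).
    """
    matches = set()
    clauses = 0
    for user in connections:
        clauses += 1  # one user_id clause per connection
        matches |= index.get(user, set())
    return matches, clauses


# Hypothetical sample data: document id -> users related to it.
doc_users = {"d1": {"alice", "bob"}, "d2": {"carol"}, "d3": {"dave"}}
index = build_user_index(doc_users)
matches, clauses = docs_for_connections(index, ["bob", "carol"])
```

With two connections the query has two clauses; with 5000 connections it would have 5000, regardless of how many documents actually match.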
It seems the User Connections and User Documents stored in a graph DB could easily provide the documents to search against, but what is an effective way to communicate that to Lucene so that it searches only those documents for the given query? Any results returned would then be guaranteed to have a relationship with at least one of the User Connections.
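The idea can be sketched as two steps: ask the graph for the set of documents reachable through the User Connections, then run the text query only over that set. The following is a rough illustration in plain Python, not Lucene or any particular graph DB. The adjacency dictionaries, function names, and sample data are all hypothetical, and how the allowed set is actually handed to Lucene is left open.

```python
def connection_documents(user, user_connections, user_documents):
    """Collect the ids of documents related to any of the user's connections."""
    allowed = set()
    for friend in user_connections.get(user, ()):
        allowed |= user_documents.get(friend, set())
    return allowed


def search(texts, term, allowed):
    """Naive term match restricted to the allowed document ids."""
    term = term.lower()
    return sorted(
        doc_id
        for doc_id in allowed
        if term in texts.get(doc_id, "").lower().split()
    )


# Hypothetical graph data: user -> connections, user -> related documents.
user_connections = {"alice": ["bob", "carol"]}
user_documents = {"bob": {"d1"}, "carol": {"d2", "d3"}, "dave": {"d4"}}

# Hypothetical document contents.
texts = {
    "d1": "graph search with lucene",
    "d2": "cooking pasta",
    "d3": "lucene filters",
    "d4": "lucene internals",
}

allowed = connection_documents("alice", user_connections, user_documents)
results = search(texts, "lucene", allowed)
```

Here d4 also contains the search term, but it is excluded because dave is not one of alice's connections. The open question from the post remains: the allowed set can itself be large, so it still has to be passed to the search engine efficiently.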
I don't believe there is currently any graph technology that sits on top of Solr or Lucene.
You would probably be best off looking at one of these two camps:
OR
These are graph databases. Tinkerpop Blueprints is a standard that lets you abstract away the specific implementation. Spring Data currently supports only Neo4j for graph technologies.
Neo4j costs money if you cluster (the free license is single instance only).
You can read a discussion on Solr/Lucene with graphs here: http://lucene.472066.n3.nabble.com/indexing-directed-graph-td2949556.html
Note that Neo4j supports full-text search.