Can anybody please provide or point out to a good comparison between Neo4j and Titan? One thing i can see is in terms of scale - Titan is scaleout and requires an underlying scalable datastore like cassandra. Neo4j is only for HA and has its own embedded database. Any other pros and cons? Any specific usecases. (Is Titan being used anywhere currently?)
I also have the following link: http://architects.dzone.com/articles/16-graph-databases-compared that gives a objective compare for graph databases but not much on pros and cons between Neo4j and Titan.
Neo4j has some upper bound limit for the graph size and can support tens of billions of nodes, properties, and relationships in a single graph. No security is provided at the data level and there is no data encryption. Security auditing is not available in Neo4j.
Graph databases are not as useful for operational use cases because they are not efficient at processing high volumes of transactions and they are not good at handling queries that span the entire database.
Every time you visit LinkedIn and see first-, second-, or third-degree connections, you're getting results from the social networking site's professional network graph built on a graph database. Facebook, Instagram and Twitter all use graph databases and analytics to understand how users relate to each other and ...
We have a social graph in which in a day we add almost 1 millions of node and twice as many edges. We started with neo4j graph because yes, it is very fast due to fact that its storage is on the same machine on which graph engine runs. But following are the experiences that we would like to share with you about neo4j.
Ultimately we are so much fade up with server performance with gremlin query that we had to change the database to titan.
On titan we are getting reasonable performance and also scaling is very easy as we are using cassandra as backend storage. But mind you that .. using gremlin here also not a good idea as multiget query is very ugly to write and without multiget its query becomes very slow.
Great to see you exploring graph databases. I will speak to the Neo4j part of your question:
More than 30 of the Global 2000 now use Neo4j in production for a wide range of use cases, many of them surprising, even to us! (And we invented the property graph!)
A partial list of customers can be found below: www.neotechnology.com/customers
Neo4j has been in 24x7 production for 10 years, and while the product has of course evolved significantly since then, it's built on a very solid foundation.
Most the companies moving to graph databases--speaking for Neo4j, which is what I know about-- are doing so because either a) their RDBMSs weren't able to handle the scope & scale of their connected query requirements, and/or b) the immense convenience and speed that comes from modeling domains that are a graph (social, network & data center management, fraud, portfolios, identity, etc.) as a graph, not as tables.
For kicks, you can find a number of customer talks here, from the four (soon five) GraphConnect conferences that were held this year in major cities around the world:
http://watch.neo4j.org/
If you're in London, the last one will be held next week: http://www.graphconnect.com
You'll find a summary below of some of the technology behind Neo4j, with some customer examples. To speak very directly to your question about scaling: Neo4j has a unique architecture designed to maximize query response time & query predictability, by allowing horizontal scale-out in such a way that each instance can access the graph without having to hop over the network. (Need more read throughput. Just add instances.) It turns out that this approach works well for 95+% of the graphs out there, including some production customers who have more than half of the Facebook social graph running in a single Neo4j cluster, backing an "always on" 24x7 web site.
www.neotechnology.com/neo4j-scales-for-the-enterprise/
One of the world's largest postal delivery services does all of their real-time package routing with Neo4j. Railroads are building routing systems on Neo4j. Some of the world's largest customers are using them for HR and data governance, alternate-path routing, network & data center management, real-time fraud detection, bioinformatics, etc.
Neo4j's Cypher query language is the only declarative query language built expressly for property graphs. It takes all of the lessons learned from our 13-year old native Java API (which was the basis for Blueprints, which some of the other graph databases have since adopted) and rolls them into a next-generation language. Cypher is a great way to learn graphs, and to develop applications; and there's always the native Java API if you have special needs or value "bare metal" performance (i.e. sub millisecond vs. single-digit millisecond) performance above convenience. Neo4j is built from the ground up to support graphs, and has a graph storage engine that is built to store graphs; unlike some of the more recent additions to the graph database ecosystem, which are architected as graph libraries on top of non-graph databases, and are subject to some of the inherent limitations. (e.g. FlockDB, because it is based on MySQL, will still be very slow for anything greater than one hop.)
Definitely feel free to contact the Neo team if you need anything more specific. We'll be more than happy to help you! http://info.neotechnology.com/ContactUs.html
Good luck!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With