My objective is to write a shortest path algorithm for a road network.
Currently my architecture is roughly this: I store all the data in a PostGIS-enabled PostgreSQL database. I run a single SELECT * FROM ways, which takes less than 3 seconds on a table with 100,000 edges (ways), and then apply a (Java, Ruby, or anything-based) shortest-path algorithm to the graph that now resides in memory. The second step takes about 1.5 seconds on a graph with 100,000 edges.

So it takes roughly 3 seconds to load the graph plus about 1.5 seconds to compute the path, around 4.5 seconds in total.
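To make the two steps concrete, here is a minimal sketch of what I mean in Java: one JDBC load into an adjacency list, then plain Dijkstra. The source, target and cost columns, the JDBC URL, credentials and node ids are placeholders rather than my actual schema, and it assumes Java 16+ (records) plus the PostgreSQL JDBC driver on the classpath.

```java
import java.sql.*;
import java.util.*;

public class InMemoryRouter {

    // Directed edge: target node id and traversal cost.
    record Edge(long target, double cost) {}

    // Queue entry for Dijkstra: node id and its tentative distance.
    record QueueEntry(long node, double dist) {}

    // Adjacency list built from the ways table.
    static final Map<Long, List<Edge>> graph = new HashMap<>();

    // Step 1: one SELECT to pull every edge into memory (assumed columns: source, target, cost).
    static void loadGraph(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT source, target, cost FROM ways")) {
            while (rs.next()) {
                graph.computeIfAbsent(rs.getLong("source"), k -> new ArrayList<>())
                     .add(new Edge(rs.getLong("target"), rs.getDouble("cost")));
            }
        }
    }

    // Step 2: plain Dijkstra over the in-memory adjacency list.
    static Map<Long, Double> dijkstra(long start) {
        Map<Long, Double> dist = new HashMap<>();
        PriorityQueue<QueueEntry> pq =
                new PriorityQueue<>(Comparator.comparingDouble(QueueEntry::dist));
        dist.put(start, 0.0);
        pq.add(new QueueEntry(start, 0.0));
        while (!pq.isEmpty()) {
            QueueEntry cur = pq.poll();
            if (cur.dist() > dist.getOrDefault(cur.node(), Double.MAX_VALUE)) continue; // stale entry
            for (Edge e : graph.getOrDefault(cur.node(), List.of())) {
                double nd = cur.dist() + e.cost();
                if (nd < dist.getOrDefault(e.target(), Double.MAX_VALUE)) {
                    dist.put(e.target(), nd);
                    pq.add(new QueueEntry(e.target(), nd));
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/routing", "user", "password")) {
            loadGraph(conn);
            System.out.println(dijkstra(1L).get(42L)); // cost from node 1 to node 42 (placeholder ids)
        }
    }
}
```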
This is very similar to what pgRouting does (to my knowledge it uses the C++ Boost Graph Library to hold the graph in memory), except that pgRouting takes only about 2 seconds in total to compute a shortest path on the same data set (yes, it is fast, but it is a black box to me, so I need my own implementation).
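For comparison, the pgRouting equivalent is a single SQL call that runs entirely inside the database, which is why it feels like a black box. A sketch of driving it from Java follows; it assumes the pgRouting 2.x pgr_dijkstra signature and the usual osm2pgrouting column names (gid, source, target, cost), so older releases or a different schema will need adjusting.

```java
import java.sql.*;

public class PgRoutingClient {
    public static void main(String[] args) throws SQLException {
        // pgr_dijkstra takes an SQL string describing the edge set plus start/end vertex ids;
        // the routing itself happens inside PostgreSQL.
        String sql = "SELECT seq, node, edge, cost "
                   + "FROM pgr_dijkstra('SELECT gid AS id, source, target, cost FROM ways', ?, ?, false)";
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/routing", "user", "password");
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, 1L);   // start vertex id (placeholder)
            ps.setLong(2, 42L);  // end vertex id (placeholder)
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("step %d: node %d via edge %d (cost %.2f)%n",
                            rs.getInt("seq"), rs.getLong("node"),
                            rs.getLong("edge"), rs.getDouble("cost"));
                }
            }
        }
    }
}
```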
But recently I found out about graph databases and about Neo4j. On their site they claim that "Still being able to do these calculations in sub-second speeds on graphs of millions of roads and waypoints makes it possible in many cases to abandon the normal approach of precomputing indexes with K/V stores and be able to put routing into the critical path with the possibility to adapt to the live conditions and build highly personalized and dynamic spatial services."
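For reference, this is roughly what the equivalent looks like against Neo4j's embedded Java API (the 3.x GraphAlgoFactory flavour; newer versions expose this through Cypher and the Graph Data Science library instead). The store directory, node ids and the "cost" relationship property are assumptions for the sake of the example.

```java
import org.neo4j.graphalgo.GraphAlgoFactory;
import org.neo4j.graphalgo.PathFinder;
import org.neo4j.graphalgo.WeightedPath;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.PathExpanders;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;

import java.io.File;

public class Neo4jRouter {
    public static void main(String[] args) {
        // Embedded store: the graph already lives on disk in Neo4j's native format,
        // so there is no bulk SELECT step before routing.
        GraphDatabaseService db = new GraphDatabaseFactory()
                .newEmbeddedDatabase(new File("data/roads.db"));
        try (Transaction tx = db.beginTx()) {
            Node start = db.getNodeById(1L);   // placeholder ids
            Node end = db.getNodeById(42L);
            // Built-in Dijkstra: follow any relationship type, read the weight from a "cost" property.
            PathFinder<WeightedPath> finder = GraphAlgoFactory.dijkstra(
                    PathExpanders.allTypesAndDirections(), "cost");
            WeightedPath path = finder.findPath(start, end);
            if (path != null) {
                System.out.println("total cost: " + path.weight());
            }
            tx.success();
        } finally {
            db.shutdown();
        }
    }
}
```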
So the question is: will a graph database be faster for my particular problem?
The problem has the following properties:
You certainly don't have to reinvent the wheel if you use a graph database like Neo4j. Many shortest-path algorithms are built in, and it is designed to handle the extra complexity when you have to consider a speed limit on a specific road, one-way roads, a score per road, and so on. How will you keep up with performance when your data grows 10 or 100 times? With your total time of roughly 4.5 seconds for 100,000 ways, the same approach could take on the order of a minute for 1M ways, whereas in Neo4j the response can stay in milliseconds.
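To illustrate the flexibility this refers to: with Neo4j's built-in Dijkstra you can swap the stored cost property for a custom cost function, for example to weight edges by travel time instead of distance. A hedged sketch follows; the CostEvaluator interface is Neo4j's, but the length_m and maxspeed_kmh relationship properties are invented names for this example.

```java
import org.neo4j.graphalgo.CostEvaluator;
import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.Relationship;

// Travel-time evaluator: cost = length / speed limit, so Dijkstra minimizes time, not distance.
public class TravelTimeEvaluator implements CostEvaluator<Double> {
    @Override
    public Double getCost(Relationship road, Direction direction) {
        double lengthM = ((Number) road.getProperty("length_m")).doubleValue();
        double speedKmh = ((Number) road.getProperty("maxspeed_kmh", 50)).doubleValue(); // default 50 km/h
        return lengthM / (speedKmh / 3.6); // seconds
    }
}
```

Plugging it in with GraphAlgoFactory.dijkstra(expander, new TravelTimeEvaluator()) makes the search time-based, and one-way streets can be modeled by creating the relationship only in the allowed direction and restricting the expander to Direction.OUTGOING.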