A* Algorithm for very large graphs, any thoughts on caching shortcuts?

I'm writing a courier/logistics simulation on OpenStreetMap maps and have realised that the basic A* algorithm as pictured below is not going to be fast enough for large maps (like Greater London).

http://i.imgur.com/u2tVpML.jpg

The green nodes correspond to ones that were put in the open set/priority queue, and due to their huge number (the whole map has something like 1-2 million nodes), it takes 5 seconds or so to find the route pictured. Unfortunately 100ms per route is about my absolute limit.

Currently, the nodes are stored in both an adjacency list and also a spatial 100x100 2D array.

I'm looking for methods where I can trade off preprocessing time, space and if needed optimality of the route, for faster queries. The straight-line Haversine formula for the heuristic cost is the most expensive function according to the profiler - I have optimised my basic A* as much as I can.
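To make the setup concrete, here is a minimal sketch of the kind of A* loop and Haversine heuristic I mean (Python; the dict-based adjacency list and coordinate lookup are simplifications for illustration, not my actual data structures):

```python
import heapq
import math

EARTH_RADIUS_M = 6371000.0

def haversine(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def astar(adj, coords, start, goal):
    """adj: {node: [(neighbour, edge_length_m), ...]}, coords: {node: (lat, lon)}."""
    g = {start: 0.0}
    parent = {start: None}
    open_heap = [(haversine(coords[start], coords[goal]), start)]
    closed = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:        # walk parent pointers back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        if node in closed:
            continue                       # stale entry left behind by lazy deletion
        closed.add(node)
        for nbr, length in adj[node]:
            tentative = g[node] + length
            if tentative < g.get(nbr, math.inf):
                g[nbr] = tentative
                parent[nbr] = node
                f = tentative + haversine(coords[nbr], coords[goal])
                heapq.heappush(open_heap, (f, nbr))
    return None                            # goal not reachable
```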

As an example of such a trade-off, I was thinking that if I chose an arbitrary node X from each quadrant of the 2D array and ran A* between every pair of them, I could store those routes to disk for subsequent simulations. When querying, I would then only need to run A* within the start and end quadrants, to connect each endpoint to its quadrant's X and hence to the precomputed route.
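In code terms that caching scheme would look something like the sketch below, where `search(src, dst)` stands in for the existing A* routine (returning a list of node ids) and `quadrant_of(node)` maps a node to its cell in the 100x100 array; both names are hypothetical, and the stitching glosses over joining the legs cleanly:

```python
import pickle

def precompute_quadrant_routes(search, representatives, path="quadrant_routes.pkl"):
    """representatives: {quadrant_id: chosen node X}. Cache all X-to-X routes on disk."""
    routes = {(qa, qb): search(a, b)
              for qa, a in representatives.items()
              for qb, b in representatives.items() if qa != qb}
    with open(path, "wb") as f:
        pickle.dump(routes, f)
    return routes

def cached_query(search, routes, representatives, quadrant_of, src, dst):
    """Stitch: src -> X(src's quadrant), cached X-to-X leg, X(dst's quadrant) -> dst."""
    qs, qd = quadrant_of(src), quadrant_of(dst)
    if qs == qd:
        return search(src, dst)                     # short, purely local query
    head = search(src, representatives[qs])         # confined to the start quadrant
    tail = search(representatives[qd], dst)         # confined to the end quadrant
    return head + routes[(qs, qd)][1:] + tail[1:]   # drop duplicated join nodes
```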

Is there a more refined version of what I've described above, or perhaps a different method I should pursue? Many thanks!

For the record, here are some benchmark results for arbitrarily weighting the heuristic cost and computing the path between 10 pairs of randomly picked nodes:

Weight   AvgDist%      Time (ms)
1        1             1461.2
1.05     1             1327.2
1.1      1             900.7
1.2      1.019658848   196.4
1.3      1.027619169   53.6
1.4      1.044714394   33.6
1.5      1.063963413   25.5
1.6      1.071694171   24.1
1.7      1.084093229   24.3
1.8      1.092208509   22
1.9      1.109188175   22.5
2        1.122856792   18.2
2.2      1.131574742   16.9
2.4      1.139104895   15.4
2.6      1.140021962   16
2.8      1.14088128    15.5
3        1.156303676   16
4        1.20256964    13
5        1.19610861    12.9

Surprisingly, increasing the coefficient to 1.1 almost halved the execution time whilst returning the same route.

asked by drspa44 on Apr 15 '15 at 17:04


2 Answers

You should be able to make it much faster by trading off optimality. See Admissibility and optimality on Wikipedia.

The idea is to use an epsilon value that guarantees a solution no worse than (1 + epsilon) times the optimal path, while causing far fewer nodes to be considered by the algorithm. Note that this does not mean the returned solution will always be (1 + epsilon) times the optimal path; that is just the worst case. I don't know exactly how it would behave in practice for your problem, but I think it is worth exploring.

Wikipedia lists a number of algorithms that rely on this idea. I believe this is your best bet to improve the algorithm, and it has the potential to run within your time limit while still returning good paths.
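As a rough sketch of what this means in code (assuming a heuristic function along the lines of the Haversine one sketched in the question), the only change is to inflate the heuristic term by the chosen factor:

```python
def make_weighted_heuristic(heuristic, epsilon):
    """Inflate an admissible heuristic by (1 + epsilon).

    A* using the returned function still finds a path, but it is only
    guaranteed to be within (1 + epsilon) times the optimal length; in
    return the search typically expands far fewer nodes.
    """
    return lambda a, b: (1.0 + epsilon) * heuristic(a, b)

# epsilon = 0.2 corresponds to the "Weight 1.2" row in the benchmark table in
# the question: at most 20% longer than optimal in the worst case.
```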

Since your algorithm already processes millions of nodes in about 5 seconds, I assume you are using a binary heap for the priority queue, correct? If you implemented it yourself, make sure it is a proper binary heap backed by a simple array.
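For reference, an array-backed binary min-heap is just a flat list with sift-up and sift-down, roughly like the sketch below (in Python the standard heapq module already provides exactly this on top of a plain list, so rolling your own only makes sense in a lower-level language):

```python
class ArrayBinaryHeap:
    """Minimal array-backed binary min-heap of (priority, item) pairs.

    The children of index i live at 2*i + 1 and 2*i + 2, so the whole heap
    sits in one contiguous array with no pointer-based tree nodes.
    """

    def __init__(self):
        self._a = []

    def push(self, priority, item):
        a = self._a
        a.append((priority, item))
        i = len(a) - 1
        while i > 0:                                  # sift up
            parent = (i - 1) // 2
            if a[parent][0] <= a[i][0]:
                break
            a[parent], a[i] = a[i], a[parent]
            i = parent

    def pop(self):
        a = self._a
        top = a[0]                                    # smallest priority
        last = a.pop()
        if a:
            a[0] = last
            i, n = 0, len(a)
            while True:                               # sift down
                left, right, smallest = 2 * i + 1, 2 * i + 2, i
                if left < n and a[left][0] < a[smallest][0]:
                    smallest = left
                if right < n and a[right][0] < a[smallest][0]:
                    smallest = right
                if smallest == i:
                    break
                a[i], a[smallest] = a[smallest], a[i]
                i = smallest
        return top
```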

answered by IVlad on Sep 21 '22 at 14:09

There are specialist algorithms for this problem that do a lot of pre-computation. From memory, the pre-computation adds information to the graph that A* uses to produce a much more accurate heuristic than straight line distance. Wikipedia gives the names of a number of methods at http://en.wikipedia.org/wiki/Shortest_path_problem#Road_networks and says that Hub Labelling is the leader. A quick search on this turns up http://research.microsoft.com/pubs/142356/HL-TR.pdf. An older one, using A*, is at http://research.microsoft.com/pubs/64505/goldberg-sp-wea07.pdf.
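One widely used member of this family is ALT (A*, Landmarks, Triangle inequality): precompute exact shortest-path distances from a handful of well-spread landmark nodes (one Dijkstra per landmark), then at query time use the triangle inequality as a much tighter admissible heuristic than straight-line distance. A sketch of the query-time part only, assuming dist_from[L][v] holds the precomputed distance from landmark L to node v on an undirected road graph (this is not the Hub Labelling scheme from the first paper):

```python
def alt_heuristic(dist_from, landmarks, node, goal):
    """Landmark lower bound on dist(node, goal).

    On an undirected graph the triangle inequality gives, for every landmark L,
    dist(node, goal) >= |dist_from[L][goal] - dist_from[L][node]|; taking the
    maximum over landmarks keeps the bound admissible while tightening it.
    """
    best = 0.0
    for L in landmarks:
        bound = abs(dist_from[L][goal] - dist_from[L][node])
        if bound > best:
            best = bound
    return best

# A directed graph needs distances both from and to each landmark, and the
# landmarks are usually chosen near the periphery of the map.
```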

Do you really need to use Haversine? To cover London, I would have thought you could have assumed a flat earth and used Pythagoras, or stored the length of each link in the graph.
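Concretely, the flat-earth version is a short sketch like the one below (coordinates assumed to be in degrees); over an area the size of Greater London its error relative to Haversine is negligible, and it drops most of the trigonometry:

```python
import math

EARTH_RADIUS_M = 6371000.0

def flat_earth_distance(a, b):
    """Equirectangular approximation: scale longitude by cos(mean latitude),
    then apply Pythagoras on the resulting planar offsets."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2.0)
    y = lat2 - lat1
    return EARTH_RADIUS_M * math.hypot(x, y)

# Over a city-sized area the cos() factor is nearly constant, so it could even
# be computed once per query (from the goal's latitude) instead of per node.
```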

answered by mcdowella on Sep 21 '22 at 14:09