Dijkstra-based algorithm optimization for caching

I need to find the optimal path connecting two planar points. I'm given a function that determines the maximal advance velocity, which depends on both the location and the time.

My solution is based on Dijkstra's algorithm. First I cover the plane with a 2D lattice, considering only discrete points. Points are connected to their neighbors up to a specified order, to get sufficient direction resolution. Then I find the best path with a (sort of) Dijkstra's algorithm. Next I improve the resolution/quality of the found path: I increase the lattice density and the neighbor connectivity order, while restricting the search to points close enough to the already-found path. This may be repeated until the needed resolution is achieved.
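To illustrate, here's a rough sketch of that coarse-to-fine loop in C++. buildLattice() and runDijkstra() are hypothetical placeholders standing in for the machinery described above, not real APIs:

    // Purely illustrative sketch of the coarse-to-fine refinement loop.
    // buildLattice() and runDijkstra() are hypothetical placeholders.
    #include <vector>

    struct Point { double x, y; };
    using Path = std::vector<Point>;

    struct Lattice { /* lattice points + neighbor connectivity */ };
    Lattice buildLattice(double spacing, int order, const Path& nearPath);
    Path    runDijkstra(const Lattice& lattice, Point src, Point dst);

    Path findPath(Point src, Point dst, double targetSpacing) {
        double spacing = 1.0;  // coarse initial lattice spacing
        int order = 2;         // neighbor connectivity order
        Path path;             // empty: first pass searches the whole region
        for (;;) {
            // On refinement passes the lattice is restricted to points
            // close enough to the previously found path.
            Lattice lattice = buildLattice(spacing, order, path);
            path = runDijkstra(lattice, src, dst);
            if (spacing <= targetSpacing) break;
            spacing /= 2;  // densify the lattice
            ++order;       // improve direction resolution
        }
        return path;
    }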

This works generally well, but I'd nevertheless like to improve the overall performance. I've implemented several tricks, such as variable lattice density and neighbor connectivity order, based on the "smoothness" of the price function. However, I believe there's potential to improve Dijkstra's algorithm itself (as applied to my specific graph), which I haven't fully realized yet.

First let's agree on the terminology. I split all the lattice points into 3 categories:

  • cold - points that have not been reached by the algorithm.
  • warm - points that have been reached, but not fully processed yet (i.e. their price may still improve).
  • stable - points that are fully processed.

At each step, Dijkstra's algorithm picks the cheapest warm lattice point and tries to improve the price of its neighbors. Because of the nature of my graph, I get a kind of cloud of stable points, surrounded by a thin layer of warm points. At each step a warm point on the cloud perimeter is processed; it is then added to the stable cloud, and the warm perimeter is (potentially) expanded.
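For reference, the three categories map directly onto the textbook Dijkstra loop. A minimal sketch for a generic adjacency-list graph with non-negative edge costs (names are illustrative):

    #include <functional>
    #include <limits>
    #include <queue>
    #include <utility>
    #include <vector>

    enum class State { Cold, Warm, Stable };

    struct Edge { int to; double cost; };

    std::vector<double> dijkstra(const std::vector<std::vector<Edge>>& adj, int src) {
        const double INF = std::numeric_limits<double>::infinity();
        std::vector<double> price(adj.size(), INF);
        std::vector<State> state(adj.size(), State::Cold);
        using QE = std::pair<double, int>;  // (price, point)
        std::priority_queue<QE, std::vector<QE>, std::greater<QE>> warm;

        price[src] = 0.0;
        state[src] = State::Warm;
        warm.push({0.0, src});
        while (!warm.empty()) {
            auto [p, u] = warm.top();        // cheapest warm point
            warm.pop();
            if (p > price[u]) continue;      // stale entry: u was already improved
            state[u] = State::Stable;        // u is now fully processed
            for (const Edge& e : adj[u]) {   // try to improve its neighbors
                if (price[u] + e.cost < price[e.to]) {
                    price[e.to] = price[u] + e.cost;
                    state[e.to] = State::Warm;  // cold -> warm (or warm re-queued)
                    warm.push({price[e.to], e.to});
                }
            }
        }
        return price;
    }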

The problem is that the warm points processed consecutively by the algorithm are usually spatially (hence topologically) unrelated. A typical warm perimeter consists of hundreds of thousands of points. At each step the next warm point to process is spatially pseudo-random, so there's virtually no chance that two related points are processed one after another.

This creates a real problem with CPU cache utilization: at each step the CPU deals with a pseudo-random memory location. Since there's a large number of warm points, the relevant data may not fit the CPU cache (it's on the order of tens to hundreds of MB).

Well, this is indeed an implication of Dijkstra's algorithm: the whole idea is explicitly to pick the cheapest warm point, regardless of other properties.

However, intuitively it's obvious that points on one side of a big cloud perimeter have no influence on points on the other side (in our specific case), so there's no problem with swapping their processing order.

Hence I thought about ways of adjusting the warm points' processing order without compromising the algorithm in general. I considered several ideas, such as dividing the plane into blocks and partially solving them independently until some criterion indicates that their solutions may interfere. Alternatively, ignore the interference and allow "re-solving" (i.e. a transition from stable back to warm); a rough sketch of this is below.
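To make that idea concrete, here's a purely illustrative sketch (blockOf is a hypothetical point-to-block map). Draining one block's warm set as a batch should be cache-friendly, and since edge costs are non-negative, letting a point fall back from stable to warm whenever its price improves makes this a label-correcting relaxation (worklist Bellman-Ford style) that still converges to the same prices, at the cost of re-processing some points:

    // Illustrative sketch of the "blocks + re-solving" idea: warm points are
    // grouped by spatial block, one block's warm set is drained as a batch,
    // and a point whose price later improves simply goes back to warm.
    #include <limits>
    #include <vector>

    struct Edge { int to; double cost; };

    std::vector<double> blockRelax(const std::vector<std::vector<Edge>>& adj,
                                   const std::vector<int>& blockOf,  // point -> block id
                                   int numBlocks, int src) {
        const double INF = std::numeric_limits<double>::infinity();
        std::vector<double> price(adj.size(), INF);
        std::vector<char> queued(adj.size(), 0);
        std::vector<std::vector<int>> warm(numBlocks);  // per-block warm sets

        price[src] = 0.0;
        warm[blockOf[src]].push_back(src);
        queued[src] = 1;
        for (bool active = true; active; ) {
            active = false;
            for (int b = 0; b < numBlocks; ++b) {
                while (!warm[b].empty()) {              // drain one block as a batch
                    active = true;
                    int u = warm[b].back();
                    warm[b].pop_back();
                    queued[u] = 0;
                    for (const Edge& e : adj[u]) {
                        if (price[u] + e.cost < price[e.to]) {
                            price[e.to] = price[u] + e.cost;  // may "re-solve" a stable point
                            if (!queued[e.to]) {
                                warm[blockOf[e.to]].push_back(e.to);
                                queued[e.to] = 1;
                            }
                        }
                    }
                }
            }
        }
        return price;
    }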

However, so far I could not find a rigorous method.

Are there any ideas how to do this? Perhaps it's a known problem, with existing research and (hopefully) solutions?

Thanks in advance. And sorry for the long question.

asked Mar 10 '12 by valdo


1 Answer

What you're describing is the motivation behind the A* search algorithm, a modification of Dijkstra's algorithm that can dramatically improve the runtime by biasing the search toward the destination, so that it tends to pick points that keep getting closer and closer to it. With an admissible heuristic, A* never expands more nodes than a naive Dijkstra's implementation does, and it typically expands nodes clustered on the part of the warm frontier closest to the destination node.

Internally, A* works by augmenting Dijkstra's algorithm with a heuristic function that estimates the remaining distance to the target node. This means that if you can get a rough approximation of how far away a given node is from the destination, you can end up ignoring nodes that don't need to be processed in favor of nodes that are likely to be better.
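Concretely, the only change relative to plain Dijkstra's is to order the warm queue by g(u) + h(u) instead of g(u), where h never overestimates the remaining cost. In your case, if the velocity function has a global upper bound vMax, then the straight-line distance to the destination divided by vMax is an admissible lower bound on the remaining travel time. A minimal sketch, with heuristic() as a hypothetical problem-specific callback:

    #include <functional>
    #include <limits>
    #include <queue>
    #include <utility>
    #include <vector>

    struct Edge { int to; double cost; };

    // Hypothetical, problem-specific: must never overestimate the true
    // remaining cost from v to goal (e.g. euclideanDistance(v, goal) / vMax).
    double heuristic(int v, int goal);

    double astar(const std::vector<std::vector<Edge>>& adj, int src, int goal) {
        const double INF = std::numeric_limits<double>::infinity();
        std::vector<double> g(adj.size(), INF);  // best known price from src
        using QE = std::pair<double, int>;       // (g + h, point)
        std::priority_queue<QE, std::vector<QE>, std::greater<QE>> warm;

        g[src] = 0.0;
        warm.push({heuristic(src, goal), src});
        while (!warm.empty()) {
            auto [f, u] = warm.top();
            warm.pop();
            if (u == goal) return g[u];                    // popping the goal is optimal
            if (f > g[u] + heuristic(u, goal)) continue;   // stale entry
            for (const Edge& e : adj[u]) {
                if (g[u] + e.cost < g[e.to]) {
                    g[e.to] = g[u] + e.cost;
                    warm.push({g[e.to] + heuristic(e.to, goal), e.to});
                }
            }
        }
        return INF;  // destination unreachable
    }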

A* was not designed as a cache-optimal algorithm, but I believe that the increase in speed due to

  1. expanding fewer nodes (so less data competing for the cache), and
  2. expanding nodes closer to the goal (which were processed more recently and are thus more likely to still be in the cache)

will give you a huge performance increase and better cache performance.

Hope this helps!

answered Sep 22 '22 by templatetypedef