For my school's ACM-ICPC team, we've been stumped on how to do this specific practice problem:
Given N nodes located at coordinates (X{i}, Y{i}), calculate the shortest Manhattan distance needed to traverse nodes 1, 2, 3, ..., N in that order and then return to node 1. You can only visit each node once. If such a path does not exist, output -1.
N is bounded by [1, 100]
The X and Y coordinates are bounded by [1, 100,000,000]
We've come up with some sample data:
N = 4
Coordinates:
1. (2,2)
2. (2,4)
3. (2,1)
4. (1,3)
The shortest path in this case would be 12: 2 units from node 1 to node 2, 5 units from node 2 to node 3 (going around node 1, which is in the way), 3 units from node 3 to node 4, then 2 units to return to node 1.
I think it might be a Hamiltonian cycle problem. I also think that building the "graph" is the main part of the problem. Once the graph is built, you can just walk it to find the overall shortest distance. My problem is detecting whether the path from node i to node i+1 has another node on it, and if it does, finding the best way to route around it (knowing that the re-route may ALSO have a node on it).
UPDATE: I have O(N^2) code to detect whether two nodes are in the same row, diagonal, or column, but I don't know how to compute the re-route.
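A check along those lines might look like the following sketch (Node and onSegment are placeholder names, not the actual code); it reports whether a third node p lies strictly on the straight segment between two consecutive nodes a and b that share a row or a column:

struct Node { long long x, y; };

// True if p lies strictly between a and b on a shared row or column.
// (Coordinates go up to 100,000,000, so the products fit easily in long long.)
bool onSegment(const Node& a, const Node& b, const Node& p) {
    if (a.x == b.x && p.x == a.x)                 // same column
        return (p.y - a.y) * (p.y - b.y) < 0;     // strictly between in y
    if (a.y == b.y && p.y == a.y)                 // same row
        return (p.x - a.x) * (p.x - b.x) < 0;     // strictly between in x
    return false;
}

Running this for every other node against each consecutive pair gives the O(N^2) check mentioned above.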
(I've extended this answer to describe a different, and hopefully much faster, algorithm. As a result, some of the comments are a little out of date.)
First, I'll try to clarify the problem:
We have a large lattice. 100,000,000 times 100,000,000 nodes. This is a lot of nodes. A small number, N, of these nodes are the given nodes. We are to find a cycle in the lattice. The cycle does not have to visit every node in the lattice, but it does have to visit each of the given nodes exactly once. And it must visit those given nodes in the given order.
Solution:
There are N shortest paths we must find. We must find:

g_1 to g_2, avoiding all the other given nodes
g_2 to g_3, avoiding all the other given nodes
...
g_{N-1} to g_N, avoiding all the other given nodes
g_N to g_1, avoiding all the other given nodes
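The overall answer is then just the sum of these N leg lengths, or -1 if any single leg turns out to be impossible. A sketch of that outer loop, assuming a hypothetical shortestLeg() that returns the obstacle-avoiding lattice distance between two given nodes (or -1):

#include <vector>
using namespace std;

struct Node { long long x, y; };   // a given node

// Hypothetical: shortest lattice distance from `from` to `to` that avoids
// every node in `blocked`, or -1 if no such path exists.
long long shortestLeg(const Node& from, const Node& to,
                      const vector<Node>& blocked);

// Sum the N independent legs g_1 -> g_2, ..., g_N -> g_1.
long long tourLength(const vector<Node>& g) {
    long long total = 0;
    int n = g.size();
    for (int i = 0; i < n; ++i) {
        vector<Node> blocked;                        // all given nodes except the two endpoints
        for (int j = 0; j < n; ++j)
            if (j != i && j != (i + 1) % n) blocked.push_back(g[j]);
        long long leg = shortestLeg(g[i], g[(i + 1) % n], blocked);
        if (leg < 0) return -1;                      // this leg cannot be completed
        total += leg;
    }
    return total;
}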
These problems are independent of each other. I'll try to describe a fast algorithm later, but first here's a naive algorithm to find the path from a given source to a given target node:
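A sketch of such a naive breadth-first search could look like this (the names, types and the explicit lattice-limit check are assumptions added for the sketch):

#include <map>
#include <queue>
#include <set>
using namespace std;

typedef pair<long long, long long> Pt;   // an (x, y) lattice point
const long long LIMIT = 100000000;       // coordinate bound from the problem statement

// Plain BFS from `source` to `target`, never stepping on a point in `blocked`.
// Returns the number of unit steps, or -1 if the target can't be reached.
long long bfsDistance(Pt source, Pt target, const set<Pt>& blocked) {
    map<Pt, long long> dist;             // visited points and their distances
    queue<Pt> q;
    dist[source] = 0;
    q.push(source);
    const long long dx[4] = {1, -1, 0, 0};
    const long long dy[4] = {0, 0, 1, -1};
    while (!q.empty()) {
        Pt cur = q.front(); q.pop();
        if (cur == target) return dist[cur];
        for (int k = 0; k < 4; ++k) {
            Pt next(cur.first + dx[k], cur.second + dy[k]);
            if (next.first < 1 || next.first > LIMIT ||
                next.second < 1 || next.second > LIMIT) continue;    // stay on the lattice
            if (blocked.count(next) || dist.count(next)) continue;   // blocked or already seen
            dist[next] = dist[cur] + 1;
            q.push(next);
        }
    }
    return -1;
}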
That isn't a fast algorithm, but I hope it's correct. It'll be much slower than it could be. We can improve on the standard BFS shortest-path algorithm because we aren't dealing with an arbitrary graph; we are dealing with a lattice.
(I think this is a standard A* search algorithm. Thanks to my colleagues in university for pointing this out.)
Assuming you're with me so far, the goal now is to find the shortest path between two nodes on a lattice, where a (small) number of nodes are unusable.
Imagine that each usable node in the lattice has two variables associated with it, an 'upper bound' and a 'lower bound'. These bounds describe the upper and lower bounds on the distance back to the source node. We'll initialize the lower bound with the 'optimistic' Manhattan distance back to the source node ('optimistic' in that we allow all nodes to be used). We initialize the upper bound to be infinite for all nodes, except for the source node which is assigned an upper bound of 0. For a single node, once its upper bound has converged to its lower bound, then this is the correct distance to the source node. The algorithm proceeds, gradually increasing the lower bounds and decreasing the upper bounds, until the bounds at the target node have converged.
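For example, with the source at (2,2), a node at (5,7) would start with a lower bound of |5-2| + |7-2| = 8 and an upper bound of infinity.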
At the beginning, the bounds have already converged for the source node. The distance from the source to itself is simply 0.
(Obviously, it's not practical to store all these pairs of numbers for all nodes, but we'll answer that later).
How do we update the bounds? Given any 'current' node, the upper bounds of its neighbours shouldn't be any more than 1 more than the upper bound of the current node. This allows a node to 'pull down' the upper bounds of its neighbours.
Also, if there is a large discrepancy in the lower bounds of two neighbouring nodes, they can affect each other. If node A has a lower bound of 5 and it is connected to node B which has a lower bound of 3, then we can increase the lower bound of B to 4. Proof by contradiction: If B could be reached in just 3 steps, then clearly A could be reached in at most 4 steps by going via B. But this contradicts our knowledge that the lower bound at A is 5. Therefore, the lower bound at B should be increased to 4.
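For a 'current' node c and a usable neighbour n, the two rules amount to the following (upper() and lower() standing in for whatever accessors hold the stored bounds):

// c pulls n's upper bound down (only applied once upper(c) is finite):
upper(n) = min(upper(n), upper(c) + 1);
// c pushes n's lower bound up:
lower(n) = max(lower(n), lower(c) - 1);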
This shows how nodes are able to pull up or push down their neighbours' bounds. We have to turn this idea into an algorithm.
We need to maintain a 'frontier' of nodes. This frontier is the set of nodes that have the potential to affect their neighbours' bounds. Initially the frontier is a singleton set, just containing the source node.
Loop through the following until the bounds have converged at the target node:
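Roughly, each iteration takes one node out of the frontier, applies the two rules above to its four lattice neighbours, and puts any neighbour whose bounds changed back into the frontier. A sketch of that shape (converged(), usable(), neighboursOf() and updateNeighbour() are hypothetical helpers, and frontier is a set of lattice points):

// converged(p): p's upper and lower bounds are equal.
// updateNeighbour(c, n): applies the two rules above; returns true if n's bounds changed.
while (!frontier.empty() && !converged(target)) {
    Pt c = *frontier.begin();            // pick any frontier node (a smarter choice helps)
    frontier.erase(frontier.begin());
    for (Pt n : neighboursOf(c)) {       // the four lattice neighbours of c
        if (!usable(n)) continue;        // skip the given nodes we must avoid
        if (updateNeighbour(c, n))       // n's bounds changed...
            frontier.insert(n);          // ...so n may now be able to affect its own neighbours
    }
}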
Finally, we have to design a suitable data structure to store these bounds. It's not feasible to store them explicitly for every point of this huge lattice. I would use a C++ map:
// mapping from (x,y) to (upper_bound, lower_bound)
map< pair<int,int>, pair<int,int> > bounds;
Initially, this map would contain only the source node: (source_x, source_y) => (0,0)
When there is an attempt to query this mapping for a node which is not currently present, we can just initialize it as described above, with the optimistic Manhattan score for the lower bound and infinity (INT_MAX) for the upper bound. Remember that the algorithm will only be querying this for nodes which are neighbours of the frontier nodes, so it shouldn't grow too big in many cases. Also, once a node has converged and all its neighbours have converged (or are unusable), we can permanently remove it from the map.
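A lazily-initialising lookup along those lines might be (boundsOf is a placeholder name; bounds, source_x and source_y are assumed to be in scope, with <cstdlib> and <climits> included):

// Returns a reference to the (upper_bound, lower_bound) pair for (x, y),
// creating it on first access with INT_MAX and the optimistic Manhattan
// distance back to the source.
pair<int,int>& boundsOf(int x, int y) {
    pair<int,int> key(x, y);
    if (bounds.find(key) == bounds.end()) {
        int optimistic = abs(x - source_x) + abs(y - source_y);
        bounds[key] = make_pair(INT_MAX, optimistic);   // (upper, lower), matching the comment above
    }
    return bounds[key];
}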