Algorithm to clone a tree is quite easy, we can do pre-order traversal for that. Is there an efficient algorithm to clone a graph?
I tried a similar approach, and concluded we need to maintain a hash-map of nodes already added in the new graph, else there will be duplication of nodes, since one node can have many parents.
It suffices to do a depth first search and copy each node as it's visited. As you say, you need some way of mapping nodes in the original graph to corresponding copies so that copies of cycle and cross edges can be connected correctly.
This map also suffices to remember nodes already visited in the DFS.
So in Java you'd have something like:
class Node {
// Contents of node...
Data data;
// ... member declarations like array of adjacent nodes
final ArrayList<Node> children = new ArrayList<>();
// Recursive graph walker for copying, not called by others.
private Node deepCloneImpl(Map<Node, Node> copies) {
Node copy = copies.get(this);
if (copy == null) {
copy = new Node(data);
// Map the new node _before_ copying children.
copies.put(this, copy);
for (Node child : children)
copy.children.add(child.deepCloneImpl(copies));
}
return copy;
}
public Node deepClone() {
return deepCloneImpl(new HashMap<Node, Node>());
}
}
Note that you don't want to override equals
or hashCode
for Node
. The map containing copies relies on reference equality as defined by Object
.
A different approach is to put an additional field in each node that points to the copy if there is one and is null otherwise. This merely implements the map by another method. But it requires two passes over the graph: one to make the copy and another to clear these map fields for future re-use.
Your hash map approach seems viable if you had some quick way of uniquely identifying every node.
Otherwise, you would be best off if you:
used data-structure to store graph that allows you to store "is duplicated" unique flag directly in every node (for example, 0
not duplicated, 1
to numberOfNodes
for duplicated nodes),
traversed nodes (via BFS, DFS) taking care of the already duplicated nodes and reconstructing all connections from the starting graph.
Both your starting graph and finishing graph should have corresponding unique "is duplicated" flag in every node, so you are able to reconstruct connections from starting graph properly. Of course, you could use "is duplicated" flag and "unique identifier" as separate variables.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With