Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Intuitive explanation of binary tree traversals without recursion

I have seen many articles and books (and Stack Overflow answers) that show how to do preorder, inorder, and postorder depth-first tree traversals iteratively, using an explicit stack instead of recursion. For example: https://en.wikipedia.org/wiki/Tree_traversal#Depth-first_search_2

Preorder traversal is simple, but I think the others are complicated and far from obvious.

Is there any source (preferably article or book) that explains these algorithms intuitively, so you can see how someone would have come up with them in the first place?

like image 221
gesgsklw Avatar asked Aug 02 '16 15:08

gesgsklw


2 Answers

How to come up with an iterative solution without a stack

A stack is not necessary to implement iterative tree traversals! You can get rid of any stacks whatsoever by keeping a parent pointer in the tree node data structure. This is how you come up with it:

What is an iterative solution? An iterative solution is a solution where a fixed part of the code is repeatedly executed in a loop (pretty much the definitive of iteration). The input to a loop is a state s1 of the system, the output is a state s2, and the loop takes the system from state s1 to state s2. You start with an initial state s, and you finish when you reach the final desired state s.

So our problem reduces to finding:

  • A characterisation of the state of the system which helps us achieve this goal. The initial state will coincide with our initial conditions, and the terminal state will coincide with our desired result
  • Finding the instructions that are executed repeatedly as part of the loop

(This effectively turns the tree into a state machine.)

In tree traversals, at every step, a node is visited. Each node of the tree is visited at most thrice - once from the parent, once from the leftChild, and once from the right Child. What we do with the node at a particular step depends on which of the three cases it is.

So if we capture all this information: which node it is we are visiting, and which of the cases it is, we have a characterization of the system.

A way to capture this information is to store a reference to a previous node/state:

Node current;
Node previous;

If previous = current.parent, then we are visiting from the parent. If previous = current.leftChild, we are visiting from the left, and if previous = current.rightChiild, we are visiting from the right.

Another way we can capture this information:

Node current; 
boolean visitedLeft;
boolean visitedRight;

If visitedLeft and visitedRight both are false, then we are visiting from the parent, if visitedLeft is true but visitedRight is false, we are visiting from the left, and if both visitedLeft and visitedRight are true, we are visiting from the right (the fourth state: visitedLeft false but visitedRight false, is never reached in preOrder).

Initially, we start with viisitedLeft = false, visitedRight = false, and current = root. When the traversal is complete, we expect visitedLeft = true, visitedRight = true, and current = null.

In the instructions run repeated as part of the loop, the system has to move from one state to another. So in the instructions, we just tell the system what to do when it encounters any of the state, and when to end execution.

You can combine all three traversals in one function with:

void traversal(String typeOfTraversal){

    boolean visitedLeft = false;
    boolean visitedRight = false;
    TreeNode currentNode = this.root;

    while(true){

        if (visitedLeft == false && currentNode.leftChild != null){
            if(typeOfTraversal == "preOrder"){
                System.out.println(currentNode.key);
            }
            currentNode = currentNode.leftChild;
            continue;
        }

        if (visitedLeft == false && currentNode.leftChild == null){
            if(typeOfTraversal == "preOrder"){
                System.out.println(currentNode.key);
            }
            visitedLeft = true;
            continue;
        }

        if (visitedLeft == true && visitedRight == false && currentNode.rightChild != null){
            if(typeOfTraversal == "inOrder"){
                System.out.println(currentNode.key);
            }
            currentNode = currentNode.rightChild;
            visitedLeft = false;
            continue;
        }

        if (visitedLeft == true && visitedRight == false && currentNode.rightChild == null){
            if(typeOfTraversal == "inOrder"){
                System.out.println(currentNode.key);
            }
            visitedRight = true;
            continue;
        }

        if (visitedLeft == true && visitedRight == true && currentNode.parent != null){
            if(typeOfTraversal == "postOrder"){
                System.out.println(currentNode.key);
            }

            if (currentNode == currentNode.parent.leftChild){
                visitedRight = false;
            }
            currentNode = currentNode.parent;
        }

        if (visitedLeft == true && visitedRight == true && currentNode.parent == null){       
            if(typeOfTraversal == "postOrder"){
                System.out.println(currentNode.key);
            }
            break; //Traversal is complete.
        }

If you are given node-level locks, this algorithm allows concurrent traversal and update of the tree. Any atomic operation other than detaching a non-leaf node is safe.


How to come up with a stack-based solution

Stacks are useful data structures when thinking of converting a recursive solution to an iterative one, or for coming up with an iterative solution to a problem defined recursively. A call stack, a stack data structure that stores information about the active subroutines of a computer program, is how most high-level programming languages implement recursion below-the-hood. So using a stack explicitly in an iterative solution, we'd just be mimicking what the processor does when we write recursive code. Matt Timmermans's answer gives a good intuition as to why stacks are used and how to come up with an explicit stack based solution.

I've written about how to come up with the postOrder solution with two stacks here: Understanding the logic in iterative Postorder traversal implementation on a Binary tree.


The parent-pointer based approach consumes more memory than a stack-based approach would. On a stack, the pointers to nodes still to be processed are transient and only require on the order of O(log n) stack space, because you only need to keep enough of them around for a single path down the tree (and in practice, this could be less). Storing the parent pointers with the nodes, in contrast, takes fixed O(n) space.

like image 159
Piyush Ahuja Avatar answered Sep 18 '22 13:09

Piyush Ahuja


  • Preorder: A node is processed by visiting the node, and then processing each child.

  • Inorder: A node is processed by processing the left child, visiting the node, and then processing the right child.

  • PostOrder (DFS): A node is processed by processing each child, and then visiting the node.

In all cases, the stack is used to store the work you can't do right away. The preorder case is easiest, because there's only one kind work you have to defer -- processing a child node.

Preorder: The stack holds nodes to process. To process a node, visit it, push the right child on the stack, and process the left child next. If there's no left child, then grab one from the stack.

Inorder is also pretty easy. The stack has to store nodes to visit and nodes to process, but a node to process is always the right child of a node just visited, so:

Inorder: The stack holds nodes to visit. When we take a node from the stack, we visit it, and then process its right child. When we process a node, we put it on the stack and then process its left child.

Postorder is trickier because the stack has to store nodes to visit and nodes to process, and they aren't always simply related like they are in the Inorder case. The stack has to somehow indicate which is which.

You can do it like this:

Postorder: The stack holds nodes to visit or process, along with the number of children already processed. To process an entry (n,x) from the stack, visit node n if it has <= x children. Otherwise put (n,x+1) on the stack and process the node's first unprocessed child.

like image 22
Matt Timmermans Avatar answered Sep 19 '22 13:09

Matt Timmermans