Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to check if my AVL tree implementation is correct?

I have created an AVL tree implementation, but as an AVL tree is quite a complex structure, I need to test it. So the question is - how can I test it?

Up to this moment I have the following tests:

  1. basic sanity check - checks that for every node height equals max. height of child nodes + 1, balance is in [-1, 1], left child's key < this node's key < right child's key, and there are no circular references (like left child of a node is a node himself);

  2. check that inorder traversal on an AVL tree (and on a binary search tree in the whole) will return values from the underlying set in order;

  3. check that an AVL tree's height is strictly less than 1.44*log2(N+2)-1 (there N is number of elements) - proved by AVL tree creators;

  4. visual check - doesn't work that well, I try to draw a tree (rootnode in the first line, his direct children on the next line, childen of rootnode's direct childen on the third line and so on), but that works only on small trees, for big trees it becomes a complete mess;

  5. Russian wikipedia says that it is proven experimentally, that for two insertions one rebalancing needed and for five removals also one rebalancing needed, but is it really so? English wikipedia says nothing about it, and for my AVL one rebalancing needed for two insertions or for four removals, which is not quite the same.

Maybe these tests are enough, but if there are any more tests, not difficult to implement, why not do it?

like image 605
Graf Avatar asked Oct 17 '10 22:10

Graf


People also ask

How do you know if AVL tree is unbalanced?

A simple solution would be to calculate the height of the left and right subtree for each node in the tree. If for any node, the absolute difference between the height of its left and right subtree is more than 1, the tree is unbalanced.

How AVL tree is implemented?

AVL tree is a self-balancing Binary Search Tree where the difference between heights of left and right subtrees cannot be more than one for all nodes. Tree rotation is an operation that changes the structure without interfering with the order of the elements on an AVL tree.

Is AVL tree hard to implement?

Exactly. If you need a write-once–read-many map, AVL trees are hard to beat. In my opinion, they are also easier to implement correctly.


2 Answers

In the spirit of all these answers, I thought I'd provide a couple concrete examples to demonstrate that the basic case is not enough.

Insert - Left/Right Rebalance

Consider the following AVL balanced binary trees for an insert operation:

  20+       20+           __20+__
 /         /  \          /       \
4         4    26       4         26
         / \           / \       /  \
        3   9         3+  9    21    30
                     /   / \
                    2   7   11

Inserting either an 8 or a 15 (for example) into any of these trees will trigger essentially the same Left/Right re-balancing, but the end results are significantly different for each tree and insert value. To wit, the final landing place of the inserted value and the final balance factors of node(4) and node(20) are entirely dependent on the relative value of the right child under node(4) - if any. A test solely off any one of these cases does not necessarily prove the correctness of any others. Note: node(4) must initially be balanced for these cases; an initial imbalance in node(4) ultimately has no effect on node(20).

Case 1a: Insert 15

  20+      20++         20++      15
 /        /            /         /  \
4     => 4-     =>   15+     => 4    20
          \         /
           15      4

Case 2a: Insert 15

    20+          20++           20++         9
   /  \         /  \           /  \         / \
  4    26 =>   4-   26 =>     9+   26 =>   4+  20
 / \          / \            / \          /   /  \
3   9        3   9-         4+  15       3  15    26
                   \       /
                    15    3

Case 3a: Insert 15

      __20+__                _20++_                  __20++_                ___9___
     /       \              /      \                /       \              /       \
    4         26    =>     4-       26    =>       9+        26    =>     4+      __20__
   / \       /  \         / \      /  \           / \       /  \         / \     /      \
  3+  9    21    30      3+  9-  21    30        4+  11-  21    30      3+  7  11-       26
 /   / \                /   / \                 / \   \                /         \      /  \
2   7   11             2   7   11-             3+  7   15             2           15  21    30
                                 \            /
                                  15         2

Case 1b: Insert 8

  20+      20++        20++      8
 /        /           /         / \
4     => 4-     =>   8+     => 4   20
          \         /
           8       4

Case 2b: Insert 8

    20+          20++           20++         9
   /  \         /  \           /  \         / \
  4    26 =>   4-   26 =>     9++  26 =>   4   20-
 / \          / \            /            / \    \
3   9        3   9+         4            3   8    26
                /          / \
               8          3   8

Case 3b: Insert 8

      __20+__                _20++_                  __20++_                ___9___
     /       \              /      \                /       \              /       \
    4         26           4-       26             9+        26           4        _20-
   / \       /  \         / \      /  \           / \       /  \         / \      /    \
  3+  9    21    30 =>   3+  9+  21    30 =>     4   11   21    30 =>   3+  7-  11      26
 /   / \                /   / \                 / \                    /     \         /  \
2   7   11             2   7-  11              3+  7-                 2       8      21    30
                            \                 /     \
                             8               2       8

The more complex cases were a problem for me when I was working on optimizing the calculation of balance factors (that is, adjusting balance factors only for affected nodes rather than recalculating the entire tree).

Delete - Double Rebalancing

Now consider these trees for a delete operation:

  2            ___6___               ___5___
 / \          /       \             /       \
1   4        2         9           2         8
   / \      / \       / \         / \       / \
  3   5    1   4     8   B       1   3     7   A
              / \   /   / \           \   /   / \
             3   5 7   A   C           4 6   9   B
                            \                     \
                             D                     C

Delete node(1) from each of these trees. Note that Case 1 effectively proves Case 2, but not at all Case 3.

Case 1

  2          2            4
 / \          \          / \
1   4    =>    4    =>  2   5
   / \        / \        \
  3   5      3   5        3

Case 2

    ___6___                ___6___                 ___6___
   /       \              /       \               /       \
  2         9            2         9             4         9
 / \       / \            \       / \           / \       / \
1   4     8   B     =>     4     8   B      => 2   5     8   B
   / \   /   / \          / \   /   / \         \       /   / \
  3   5 7   A   C        3   5 7   A   C         3     7   A   C
                 \                      \                       \
                  D                      D                       D

Case 3

    ___5___              ___5___                 ___5___                   ____8____
   /       \            /       \               /       \                 /         \
  2         8          2         8             3         8              _5_          A
 / \       / \          \       / \           / \       / \            /   \        / \
1   3     7   A     =>   3     7   A      => 2   4     7   A     =>   3     7      9   B
     \   /   / \          \   /   / \                 /   / \        / \   /            \
      4 6   9   B          4 6   9   B               6   9   B      2   4 6              C
                 \                    \                       \
                  C                    C                       C
like image 55
Griffin Avatar answered Sep 24 '22 06:09

Griffin


The are plenty of examples of AVL rotations in books and on the internet, but what I found seemed arbitrary and no one place seemed to include simple examples for all 4 cases for insert and delete.

These are the simplest test case I could come up with for the 4 kinds of rotations. To make it easy to describe, I’ve used ascii characters as the key so a test case can be expressed as a string. For example, the string "abc" would be insert "a", insert "b" and then insert "c".

The full test cases create some pretty complicated trees so I’ve created two test suites. The first causes the rotations, but has empty sub-trees for the nodes being rotated making it easy to see what actually happened. The second suite has non-empty sub-trees to fully test the rotation code.

There seem to be two different nomanclature for rotates - what I learned as a 2L rotation, some books call an rl rotation and the 2R rotation is call an lr rotation. The text below uses 2R/2L.

These are the simple test cases for insert

"abc", on the insert of "c" will require a 1L rotation

a                   b
 \                 / \
  b   == 1L ==>   a   c
   \
    c

“cba”, on the insert of “a” will require a 1R rotation

    c               b
   /               / \
  b   == 1R ==>   a   c
 /
a 

"acb" on the insert of "b" will require a 2L rotation

a                  b
 \                / \
  c   == 2L ==>  a   c
 /
b

“cab” on the insert of “b” will require a 2R rotation

  c                b
 /                / \
a     == 2R ==>  a   c
 \
  b

For delete

“bcad”, on the deletion of “a” will require a 1L rotation

  b                   c
 x \                 / \
a   c   == 1L ==>   b   d
     \
      d

“cbda”, on the deletion of “d” will require a 1R rotation

    c                  b
   / x                / \
  b   d  == 1R ==>   a   c
 /
a 

“bdac” on the deletion of “a” will require a 2L rotation

  b                  c
 x \                / \
a   d   == 2L ==>  b   d
   /
  c

“cadb” on the deletion of “d” will require a 2R rotation

  c                  b
 / x                / \
a   d   == 2R ==>  a   c
 \
  b

The more complex test cases have sub-trees, most just being a single node. To make this post shorter, the insert and delete test cases are combined. The delete example becomes the insert example by skipping the insertion of the delete character. For example, using the 2R simple delete case above “cadb” becomes the insert case “cab” by skipping the insert of “d”. One consequence of this is the double rotate cases below require inserting an extra node to keep the tree balanced after inserting the node to be deleted. This results in the insert case not being minimal.

“cbedfag” on the delete of “a” or skipping “a” and the insert of “g” will require a 1L rotation at c

      c                 e
     / \               / \
    b   e  == 1R ==>  c   f
   x   / \           / \   \
  a   d   f         b   d   g
           \
            g

“ecfbdga” on the delete of “g” or skipping “g” and the insert of “a” will require a 1R rotation at e

      - e -                 c
     /     \               / \
    c       f  == 1R ==>  b   e
   / \       x           /   / \
  b   d       g         a   d   f
 /
a

“ecjadhkgilbf” on the delete of “b” or skipping “b” and the insert of “f” will require a 2L rotation at j then e. The insert case can optionally skip inserting “d”.

    - e -                       —- h —-
   /     \                     /       \
  c       j                   - e-      j
 / \     / \   == 2L ==>     /    \    / \
a   d   h   k               c      g  i   k
 x     / \   \             / \    /        \
  b   g   i   l           a   d  f          l
     /
    f

“hckbeiladfjg” on the delete of “j” or skipping “j” and the insert of “g” will require a 2R rotation at c then b. The insert case can optionally skip inserting “l”

      - h -                    - e -
     /     \                  /     \
    c       k                c       - h -
   / \     / \  == 2R ==>   / \     /     \
  b   e   i   l            b   d   f       k
 /   / \   x              /         \     / \
a   d   f   j            a           g   i   l
         \
          g

Use methods 1 and 2 from the question to verify the tree rebalanced as required (ie, verify tree is still in order and balanced). If you wanted to be really sure, convert the visual test cases to include a list of depth and balance values of the final tree to verify during an inorder traversal.

like image 45
4 revs, 3 users 99% Avatar answered Sep 23 '22 06:09

4 revs, 3 users 99%