Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

std::map Known-Position Erase Amortized Complexity And Number of Red-Black Tree Recolorings

The complexity of std::map::erase(iterator) is amortized O(1) (see here, for example). While the standard library does not dictate implementations, this de-facto means that the number of rebalancing operations needed for a red-black tree is amortized O(1). In fact, the Wikipedia entry on red-black trees seems to confirm this:

Restoring the red–black properties requires a small number (O(log n) or amortized O(1)) of color changes (which are very quick in practice) and no more than three tree rotations (two for insertion).

but seemingly without a link (and I couldn't find it in other places).

As the number of rotations is constant, the amortization is on the number of recolorings needed on the node-root path. While most nodes in a balanced tree are toward the bottom of the tree (and hence the average path is logarithmic), it apparently is amortized O(1), which is surprising and interesting. How can the amortized constant cost be proved?

like image 390
Ami Tavory Avatar asked Mar 17 '16 18:03

Ami Tavory


People also ask

What is the time complexity of deletion in red-black tree?

What is the time complexity of deletion in Red-Black Tree? It takes O(log N) time, where N is the number of nodes in the red-black tree.

What is the time complexity to find an element in a red and black tree?

Red-black trees offer logarithmic average and worst-case time complexity for insertion, search, and deletion. Rebalancing has an average time complexity of O(1) and worst-case complexity of O(log n).

How do you find amortized complexity?

Let T1, T2, …, Tk be the complexities of a sequence of operations on a data structure. The amortized complexity of a single operation in this sequence is (T1 + T2 + … + Tk) / k.

Is std :: map a tree?

std::map is a sorted associative container that contains key-value pairs with unique keys. Keys are sorted by using the comparison function Compare . Search, removal, and insertion operations have logarithmic complexity. Maps are usually implemented as red-black trees.


1 Answers

I'll assume in this answer that you're familiar with amortized analysis and are comfortable with the banker's method. I'll also assume you know the red-black tree invariants.

The short answer is for some small constant k, put k coins on every black node that doesn't have exactly one red child.

Note that there are a number of different algorithms for deletion in a red-black tree. Erasing with an iterator obviously requires one of the bottom-up algorithms. The analysis here assumes that the algorithm performs roughly as follows:

  1. Traverse upwards until a black node is found. This is always possible, since the root is black, and it never takes more than two hops since red nodes cannot have red children.

  2. Perform a "fixup" operation in O(1) time on the subtree rooted at that black node. If the fixup reduces height of the subtree or changes the color of the root from black to red, traverse one more step towards the root and go back to #1.

It takes some work to see that #2 is possible. In fact, the complexity of this is one of the motivations for Sedgewick's left-leaning red-black trees. It's mostly a question of enumerating all of the cases, doing a single or double rotation, then carefully checking you haven't violated any invariants.

One variant of the fixup operation (that isn't hard to find if you already have another valid variant) preserves two additional invariants during the course of the traversal up the tree:

  1. When the height of the subtree decreases by 1, the root of the subtree (a) originally had two black children (b) now has exactly one red child.

  2. The subtree never changes color from black to red.

Thus, for each step of the traversal, either

  1. The root of the subtree has one or two red children. We perform O(1) work, add at most O(1) coins, and halt

  2. We perform O(1) work, get back O(1) coins by turning a node with two black children into a node with one red child, and continue

Case #2 is amortized free, as long as the number of coins is large enough to cover the cost of restructuring and recoloring in case #2. As such, the total amortized cost of a delete is the number of times we hit case #1 in a single delete operation, which is at most one, since after we hit it we halt.


While this covers the mechanics of the arithmetic of the explanation, it doesn't really explain why delete is amortized O(1).

One of the cases students are sometimes taught about amortized cost is the case of incrementing a binary numbers. The cost is Ω(lg n) in the worst case, but O(1) in the amortized sense, by putting a constant number of coins on every '1' digit.

Similarly, decrementing is O(1) amortized by putting a constant number of coins on each '0' digit. However, mixing the two makes each cost Ω(lg n), even in an amortized setting, since the amortized analysis depends on all traversal steps except the last giving a constant number of coins back.

This traversal-is-free-until-you-stop theme is similar to the red-black tree analysis above. The digits that coins must be put on are the digits that represent structural adjustments that are about to be made. Using the physicist's method, potential energy is added to the structure for each digit like this.

Consider a different representation of binary numbers in which the digits can be 0, 1, or 2 (but digit d_i still represents d_i * 2^i). This is called redundant binary. Now, you can place a constant number of coins on all the 0 or 2 digits and get amortized constant-time increment and decrement. The reason is that a cascading increment or decrement changes 0s or 2s to 1s, and so always gets coins back.

So, with two digits, increment or decrement is O(1) amortized, but not both, and with three, both can be amortized O(1).

Similarly, either insertion or deletion (but not both) is amortized O(1) in all of:

  1. Red-black trees in which black nodes can only have one red child

  2. AA-trees

  3. 2-3 trees

  4. (a,2a-1) trees, for any a > 1.

While both insertion and deletion are O(1) amortized in red-black trees, (2,4) trees, and (a,2a) trees for any a > 1.

like image 121
jbapple Avatar answered Nov 04 '22 00:11

jbapple