Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a reason the Java library designers explicitly required TreeMap to be a red/black tree?

I was reading over the Javadoc for the TreeMap type and was surprised to see that it explicitly requires that TreeMap use a red/black tree as its internal implementation. Not only that, it specifically singles out a particular implementation strategy for the red/black tree:

A Red-Black tree based NavigableMap implementation. The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time, depending on which constructor is used.

This implementation provides guaranteed log(n) time cost for the containsKey, get, put and remove operations. Algorithms are adaptations of those in Cormen, Leiserson, and Rivest's Introduction to Algorithms.

I found this unusual because I haven’t seen anything else comparable to this in the Java documentation (with the exception of very precisely named types like CopyOnWriteArrayList, for example). If you look at Collections.sort, for example, there’s no mention of what sorting algorithm should be used. The HashMap documentation doesn’t specify whether the internal representation uses chained hashing, linear probing, Robin Hood hashing, etc.

I was especially curious about this because I’m used to reading the C++ spec, where the implementation of std::map is constrained only by complexity guarantees and restrictions on iterator invalidation. The C++ std::map could in principle be implemented as a red/black tree, or a scapegoat tree, or a splay tree, or even a deterministic skiplist.

Is there a documented reason why the Javadoc here is so specific about the internal implementation of TreeMap that would distinguish it from other types like HashMap or other algorithmic primitives like sorting or searching? I know for the C ISO spec that there's a rationale document explaining many of the decisions that the committee made, and if there's something analogous for the decision here, I'd love to see what it is.

like image 663
templatetypedef Avatar asked Nov 09 '17 18:11

templatetypedef


1 Answers

Is there a documented reason why the Javadoc here is so specific about the internal implementation of TreeMap that would distinguish it from other types like HashMap or other algorithmic primitives like sorting or searching?

In short, there isn't.

I know for the C ISO spec that there's a rationale document explaining many of the decisions that the committee made, and if there's something analogous for the decision here, I'd love to see what it is.

There are no such (public) documents that discuss such decisions for Java. At least, none that I have come across ... in the past 20+ years.

  1. The Java designers do not operate the same was as the formal standards groups for C, C++ and ECMAscript.

  2. Since the javadocs are embedded in the source code, actual changes to them must be made by people with access to the core codebase. This would be subject to a procedure similar to code review, rather than standards writing processes.

It can be a bit different for some of the JSR teams, but it is really up to the respective team leader to decide how much (or little) rationale they choose to document. Different JSRs operate very differently.


The best anyone can say at this stage is that the real reasons are lost in time. We don't actually know for sure who wrote those javadocs originally, and who has been maintaining them.

The one thing that we do know is that over time Sun / Oracle have taken different stances on the amount of implementation detail to include in the javadocs. Examples are the javadocs for HashMap and Arrays.sort, where the amount of detail has changed with different versions.

like image 73
Stephen C Avatar answered Oct 02 '22 14:10

Stephen C