So in Java, whenever an indexed range is given, the upper bound is almost always exclusive.
From java.lang.String:
substring(int beginIndex, int endIndex)
Returns a new string that is a substring of this string. The substring begins at the specified beginIndex and extends to the character at index endIndex - 1.
From java.util.Arrays:
copyOfRange(T[] original, int from, int to)
from - the initial index of the range to be copied, inclusive
to - the final index of the range to be copied, exclusive.
From java.util.BitSet:
set(int fromIndex, int toIndex)
fromIndex - index of the first bit to be set.
toIndex - index after the last bit to be set.
As you can see, it does look like Java tries to make it a consistent convention that upper bounds are exclusive.
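To make the three quoted signatures concrete, here is a small sketch that passes the same pair (2, 4) to each of them. The class name, helper names, and sample data are made up for illustration; only the JDK calls themselves come from the documentation quoted above.

```java
import java.util.Arrays;
import java.util.BitSet;

public class ExclusiveUpperBoundDemo {
    // Each helper uses 2 as the inclusive lower bound and 4 as the exclusive upper bound.

    static String substringDemo() {
        return "abcdef".substring(2, 4); // characters at indices 2 and 3
    }

    static Integer[] copyOfRangeDemo() {
        Integer[] source = {10, 11, 12, 13, 14, 15};
        return Arrays.copyOfRange(source, 2, 4); // elements at indices 2 and 3
    }

    static BitSet bitSetDemo() {
        BitSet bits = new BitSet();
        bits.set(2, 4); // sets bits 2 and 3, leaves bit 4 clear
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(substringDemo());                    // cd
        System.out.println(Arrays.toString(copyOfRangeDemo())); // [12, 13]
        System.out.println(bitSetDemo());                       // {2, 3}
    }
}
```

In all three cases the call touches exactly two positions, 2 and 3, and never index 4.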
My questions are:
CLARIFICATION: I fully understand that a collection of N objects in a 0-based system is indexed 0..N-1. My question is that if a range (2,4) is given, it can be either 3 items or 2, depending on the system. What do you call these systems?
AGAIN, the issue is not "first index 0, last index N-1" vs "first index 1, last index N" system; that's known as the 0-based vs 1-based system.
The issue is "There are 3 elements in (2,4)" vs "There are 2 elements in (2,4)" systems. What do you call these, and is one officially sanctioned over the other?
In general, yes. If you are working in a language with C-like syntax (C, C++, Java), then arrays are zero-indexed, and most random access data structures (vectors, array-lists, etc.) are going to be zero-indexed as well.
Starting indices at zero means that the size of the data structure is always going to be one greater than the last valid index in the data structure. People often want to know the size of things, of course, and so it's more convenient to talk about the size than to talk about the last valid index. People get accustomed to talking about ending indices in an exclusive fashion, because an array a[] that is n elements long has its last valid element in a[n-1].
There is another advantage to using an exclusive index for the ending index, which is that you can compute the size of a sublist by subtracting the inclusive beginning index from the exclusive ending index. If I call myList.subList(3, 7), then I get a sublist with 7 - 3 = 4 elements in it. If the subList() method had used inclusive indices for both ends of the list, then I would need to add an extra 1 to compute the size of the sublist.
This is particularly handy when the starting index is a variable: Getting the sublist of myList starting at i that is 5 elements long is just myList.subList(i, i + 5).
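A minimal sketch of both points, using java.util.List.subList (the real Java method is spelled with a capital L). The class name and the window and zeroTo helpers are hypothetical, added only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SubListSizeDemo {
    // Hypothetical helper: the `length` elements of `list` starting at index `start`.
    // With an exclusive end index, no +1 or -1 adjustment is needed.
    static <T> List<T> window(List<T> list, int start, int length) {
        return list.subList(start, start + length);
    }

    // Hypothetical helper: builds the list [0, 1, ..., n-1].
    static List<Integer> zeroTo(int n) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> myList = zeroTo(10);
        System.out.println(myList.subList(3, 7).size()); // 4, i.e. 7 - 3
        System.out.println(window(myList, 2, 5));        // [2, 3, 4, 5, 6]
    }
}
```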
All of that being said, you should always read the API documentation, rather than assuming that a given beginning index or ending index will be inclusive or exclusive. Likewise, you should document your own code to indicate if any bounds are inclusive or exclusive.
Credit goes to FredOverflow in his comment saying that this is called the "half-open range". So presumably, Java Collections can be described as "0-based with half-open ranges".
I've compiled some discussions about half-open vs closed ranges elsewhere:
siliconbrain.com - 16 good reasons to use half-open ranges (edited for conciseness):
- The number of elements in the range [n, m) is just m-n (and not m-n+1).
- The empty range is [n, n) (and not [n, n-1], which can be a problem if n is an iterator already pointing to the first element of a list, or if n == 0).
- For floats you can write [13, 42) (instead of [13, 41.999999999999]).
- The +1 and -1 are almost never used when handling ranges. This is an advantage if they are expensive (as it is for dates).
- If you write a find in a range, the fact that nothing was found can easily be indicated by returning the end as the found position: if( find( [begin, end) ) == end) nothing found.
- In languages that start array subscripts at 0 (like C, C++, Java, NCL), the upper bound is equal to the size.
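Several of these points can be illustrated in a few lines of Java. This is only a sketch: the class and its count and find helpers are invented here and are not taken from the linked article.

```java
public class HalfOpenRangeDemo {
    // Number of indices in the half-open range [begin, end).
    static int count(int begin, int end) {
        return end - begin;
    }

    // Linear search over data[begin, end).
    // Returns the index of the first match, or `end` if nothing was found.
    static int find(int[] data, int begin, int end, int target) {
        for (int i = begin; i < end; i++) {
            if (data[i] == target) {
                return i;
            }
        }
        return end;
    }

    public static void main(String[] args) {
        int[] data = {5, 8, 13, 21, 34};
        System.out.println(count(1, 4));               // 3
        System.out.println(count(2, 2));               // 0, the empty range [2, 2)
        System.out.println(find(data, 1, 4, 13));      // 2
        System.out.println(find(data, 1, 4, 34) == 4); // true: 34 sits at index 4, outside [1, 4)
        System.out.println(find(data, 0, data.length, 21)); // 3, upper bound equals the array size
    }
}
```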
Half-open versus closed ranges
Advantages of half-open ranges:
- Empty ranges are valid: [0 .. 0]
- Easy for subranges to go to the end of the original: [x .. $]
- Easy to split ranges: [0 .. x] and [x .. $]
Advantages of closed ranges:
- Symmetry.
- Arguably easier to read.
- ['a' ... 'z'] does not require awkward + 1 after 'z'.
- [0 ... uint.max] is possible.
That last point is very interesting. It's really awkward to write a numberIsInRange(int n, int min, int max) predicate with a half-open range if Integer.MAX_VALUE could legally be in the range.
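A short sketch of the awkwardness, assuming two hypothetical predicates (neither is a JDK method): one treats max as exclusive, the other as inclusive.

```java
public class RangePredicateDemo {
    // Half-open: min is inclusive, maxExclusive is exclusive.
    // Integer.MAX_VALUE can never satisfy n < maxExclusive for any int bound.
    static boolean isInHalfOpenRange(int n, int min, int maxExclusive) {
        return min <= n && n < maxExclusive;
    }

    // Closed: both min and max are inclusive.
    static boolean isInClosedRange(int n, int min, int max) {
        return min <= n && n <= max;
    }

    public static void main(String[] args) {
        int top = Integer.MAX_VALUE;
        System.out.println(isInClosedRange(top, 0, top));       // true
        System.out.println(isInHalfOpenRange(top, 0, top));     // false: the bound itself is excluded
        System.out.println(top + 1 == Integer.MIN_VALUE);       // true: int arithmetic wraps around
        System.out.println(isInHalfOpenRange(top, 0, top + 1)); // false: the wrapped bound makes the range empty
    }
}
```

One possible workaround is to widen the exclusive bound to long, at the cost of a slightly less uniform signature.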