If we have a val txt: kotlin.String = "1;2;3;" and would like to split it into an array of numbers, we can try the following:
val numbers = txt.split(";".toRegex())
//gives: [1, 2, 3, ]
The trailing empty String is included in the result of CharSequence.split.
On the other hand, if we look at Java Strings, the result is different:
val numbers2 = (txt as java.lang.String).split(";")
//gives: [1, 2, 3]
This time, using java.lang.String.split, the result does not include the trailing empty String. This behaviour is actually intended, as the corresponding JavaDoc states:
This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
In Kotlin's version, though, 0 is also the default limit argument, as documented here, yet internally Kotlin maps that 0 to the negative value -1 when java.util.regex.Pattern::split is called:
nativePattern.split(input, if (limit == 0) -1 else limit).asList()
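To make the effect of that mapping concrete, here is a minimal sketch that compares Kotlin's split with direct calls to java.util.regex.Pattern.split. The helper names kotlinSplit and javaSplit are made up for this illustration; only Pattern.compile, Pattern.split and CharSequence.split come from the actual libraries.

```kotlin
import java.util.regex.Pattern

// Kotlin's split with the default limit (0): the trailing empty string is kept.
fun kotlinSplit(input: String): List<String> = input.split(";".toRegex())

// Hypothetical helper: calls the Java regex API directly with an explicit limit.
fun javaSplit(input: String, limit: Int): List<String> =
    Pattern.compile(";").split(input, limit).asList()

fun main() {
    val txt = "1;2;3;"
    println(kotlinSplit(txt))    // [1, 2, 3, ]
    println(javaSplit(txt, -1))  // [1, 2, 3, ]  same as Kotlin's default
    println(javaSplit(txt, 0))   // [1, 2, 3]    what java.lang.String.split(";") returns
}
```

So calling Pattern.split with a limit of 0 is one way to still get the Java result from Kotlin code, if that is what is needed.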
It seems to be working as intended, but I'm wondering why the language restricts the Java API, since the behaviour of a limit of 0 is no longer available.
The implementation implies that it's the behavior of java.lang.String.split achieved by passing limit = 0 that is lost in Kotlin. Actually, from my point of view, it was removed to achieve consistency between the possible options in Kotlin.
Consider a string a:b:c:d: and a pattern :.
Take a look at what we can have in Java:
limit < 0 → [a, b, c, d, ]
limit = 0 → [a, b, c, d]
limit = 1 → [a:b:c:d:]
limit = 2 → [a, b:c:d:]
limit = 3 → [a, b, c:d:]
limit = 4 → [a, b, c, d:]
limit = 5 → [a, b, c, d, ] (goes on the same as with limit < 0)
limit = 6 → [a, b, c, d, ]
...
It appears that the limit = 0 option is somewhat unique: it has the trailing : neither replaced by an additional entry, as with limit < 0 or limit >= 5, nor retained in the last resulting item (as with limit in 1..4).
It seems to me that the Kotlin API improves the consistency here: there's no special case that, in some sense, loses the information about the last delimiter followed by an empty string – it's left in place either as the delimiter in the last resulting item or as a trailing empty entry.
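For comparison, here is a small sketch of the same input run through Kotlin's own CharSequence.split with a Regex. The helper name kotlinSplitWithLimit is made up for this example. As far as I know, negative limits are rejected by the Kotlin function, so only non-negative values are tried here.

```kotlin
// Hypothetical helper wrapping Kotlin's CharSequence.split(Regex, limit).
fun kotlinSplitWithLimit(input: String, limit: Int): List<String> =
    input.split(":".toRegex(), limit)

fun main() {
    val input = "a:b:c:d:"
    println(kotlinSplitWithLimit(input, 0)) // [a, b, c, d, ]  0 means "no limit"
    println(kotlinSplitWithLimit(input, 1)) // [a:b:c:d:]
    println(kotlinSplitWithLimit(input, 4)) // [a, b, c, d:]
    println(kotlinSplitWithLimit(input, 5)) // [a, b, c, d, ]
}
```

In every case the last delimiter survives, either inside the last item or as a trailing empty entry.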
IMO, the Kotlin function seems to better fit the principle of least astonishment. The zero limit in java.lang.String.split, on the contrary, looks more like a special value modifying the method's semantics. And so do the negative values, which evidently don't make intuitive sense as a limit and are not quite clear without digging through the Javadoc.