Suppose you have this expression in Java:
"adam".split("")
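This tells Java to split "adam" using the empty string ("") as the delimiter. Since Java 8, a zero-width match at the very beginning of the input no longer produces a leading empty string, so the behavior discussed here is that of Java 7 and earlier, where this yields: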
["", "a", "d", "a", "m"]
Why does Java include an empty string at the start, but not at the end? Using this logic, shouldn't the result have been:
["", "a", "d", "a", "m", ""]
The delimiter is a regular expression. The regular expression "" matches at the very beginning of the string (before the a in adam). The docs state:
Splits this string around matches of the given regular expression.
Therefore the method will split around the match before the a. The docs also say:
This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
and
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
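The two quotes above can be checked with a small sketch. It uses a comma delimiter rather than the empty string so the result does not depend on the Java version; the class and helper names are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;

class SplitLimitZero {
    // Hypothetical helper: splits with the one-argument form.
    static String[] splitDefault(String input, String regex) {
        return input.split(regex);
    }

    // Hypothetical helper: splits with an explicit limit of zero.
    static String[] splitZero(String input, String regex) {
        return input.split(regex, 0);
    }

    public static void main(String[] args) {
        String input = "a,b,,";
        // Both calls drop the two trailing empty strings after "b".
        System.out.println(Arrays.toString(splitDefault(input, ","))); // prints [a, b]
        System.out.println(Arrays.toString(splitZero(input, ",")));    // prints [a, b]
    }
}
```

Both forms return the same array, which is what "as if by invoking the two-argument split method with ... a limit argument of zero" suggests.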
Therefore, although there will also be a match at the end of the string, the trailing empty string that would result is discarded. Hence the leading empty string, but no trailing empty string. If you want the trailing empty string, just pass a negative value as a second argument:
"adam".split("", -1);
This works because only a limit of zero discards trailing empty strings; a negative limit keeps them. From the docs:
If n is non-positive then the pattern will be applied as many times as possible and the array can have any length.
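As a rough illustration of the difference between the default limit and a negative limit, the sketch below compares the two calls on "adam". The exact array contents depend on the Java version (Java 8 and later omit the leading empty string), so it only looks at the last element and the lengths. The class and helper names are hypothetical.

```java
import java.util.Arrays;

class SplitNegativeLimit {
    // Hypothetical helper: returns the last element of a non-empty array.
    static String last(String[] parts) {
        return parts[parts.length - 1];
    }

    public static void main(String[] args) {
        String[] trimmed = "adam".split("");   // limit 0: trailing empty string discarded
        String[] kept = "adam".split("", -1);  // negative limit: trailing empty string kept

        System.out.println(Arrays.toString(trimmed));
        System.out.println(Arrays.toString(kept));
        System.out.println("last (default):  \"" + last(trimmed) + "\"");
        System.out.println("last (limit -1): \"" + last(kept) + "\"");
    }
}
```

With the negative limit the array is one element longer, and that extra element is the trailing empty string.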
To answer the question of "why aren't there empty strings in the middle?": a regular expression produces at most one match per position in the string. There cannot be two matches between two consecutive characters, so, going back to the first quote from the docs, splitting around those matches yields no additional empty strings.
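To see where the empty pattern actually matches, here is a minimal sketch that collects the start index of each match using java.util.regex directly. The class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchPositions {
    // Hypothetical helper: collects the start index of every match of regex in input.
    static List<Integer> matchStarts(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // One match per position: before each character and once at the end.
        System.out.println(matchStarts("", "adam")); // prints [0, 1, 2, 3, 4]
    }
}
```

Each position appears exactly once, so there is exactly one split point between any two consecutive characters.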
The API documentation for the split method includes this text: "Trailing empty strings are therefore not included in the resulting array."