I have two regular expressions:
[a-c] : any character from a-c
[a-z] : any character from a-z
And a test:
public class SplitTest {
    public static void main(String[] args) {
        String s = "abcde";
        String[] arr1 = s.split("[a-c]");
        String[] arr2 = s.split("[a-z]");
        System.out.println(arr1.length); // prints 4: "", "", "", "de"
        System.out.println(arr2.length); // prints 0
    }
}
Why does the second split behave like this? I would expect a result of 6 empty strings ("").
According to the documentation of the single-argument String.split:
This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
To keep the trailing empty strings, you can use the two-argument version and specify a negative limit:
String s = "abcde";
String[] arr1 = s.split("[a-c]", -1); // ["", "", "", "de"]
String[] arr2 = s.split("[a-z]", -1); // ["", "", "", "", "", ""]
By default, split discards trailing empty strings. In the arr2 case, they were all trailing empty strings, so they were all discarded.
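As a small sketch of that rule, the snippet below prints the arrays themselves rather than only their lengths, which makes it easier to see that only empty strings at the end are dropped while leading and interior ones stay. The class name TrailingEmptyDemo and the helper pieces are made-up names for this illustration, and the "[c-e]" pattern is an extra case that is not in the question.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the original question.
class TrailingEmptyDemo {
    // Thin wrapper around the single-argument String.split (limit of zero).
    static String[] pieces(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String s = "abcde";
        // The empty strings come before "de", so they are not trailing and are kept.
        System.out.println(Arrays.toString(pieces(s, "[a-c]"))); // [, , , de]
        // The empty strings come after "ab", so they are trailing and are dropped.
        System.out.println(Arrays.toString(pieces(s, "[c-e]"))); // [ab]
        // Every piece is empty, so every piece is trailing and nothing is left.
        System.out.println(Arrays.toString(pieces(s, "[a-z]"))); // []
    }
}
```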
To get 6 empty strings, pass a negative limit as the second parameter to the split method, which will keep all trailing empty strings.
String[] arr2 = s.split("[a-z]", -1);
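The documentation of the two-argument split, where n is the limit, says: If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. When n is zero, trailing empty strings are additionally discarded, so only a negative limit keeps them.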
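To make the effect of the limit concrete, here is a minimal sketch that compares a negative, a zero, and a positive limit on the same input from the question. The class name SplitLimitDemo and the helper splitWithLimit are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the original question.
class SplitLimitDemo {
    // Thin wrapper around the two-argument String.split.
    static String[] splitWithLimit(String input, String regex, int limit) {
        return input.split(regex, limit);
    }

    public static void main(String[] args) {
        String s = "abcde";
        // Negative limit: split as often as possible and keep trailing empty strings.
        System.out.println(Arrays.toString(splitWithLimit(s, "[a-z]", -1))); // [, , , , , ]
        // Zero limit: split as often as possible, then drop trailing empty strings.
        System.out.println(Arrays.toString(splitWithLimit(s, "[a-z]", 0)));  // []
        // Positive limit 3: the pattern is applied at most 2 times, so the rest stays unsplit.
        System.out.println(Arrays.toString(splitWithLimit(s, "[a-z]", 3)));  // [, , cde]
    }
}
```

A negative limit is the one to reach for when the number of fields matters, for example when counting delimiters or parsing fixed-column input.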