I am a bit confused about Scala's string split behaviour: it does not seem to work consistently, and some elements go missing. For example, suppose I have a CSV string with 4 columns and 1 missing element:
"elem1,elem2,,elem4".split(",") = Array("elem1", "elem2", "", "elem4")
Great! That's what I would expect.
On the other hand, if both elements 3 and 4 are missing, then:
"elem1,elem2,,".split(",") = Array("elem1", "elem2")
Whereas I would expect it to return:
"elem1,elem2,,".split(",") = Array("elem1", "elem2", "", "")
Am I missing something?
As Peter mentioned in his answer, "string".split(), in both Java and Scala, does not return trailing empty strings by default.
You can, however, specify for it to return trailing empty strings by passing in a second parameter, like this:
String s = "elem1,elem2,,";
String[] tokens = s.split(",", -1);
And that will get you the expected result.
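To make the difference concrete, here is a minimal sketch comparing the two calls on the string from the question. The class and method names (TrailingEmptyDemo, splitDefault, splitKeepingTrailing) are made up for illustration; only String.split itself comes from the JDK.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative, not from the question.
class TrailingEmptyDemo {
    // One-argument split: behaves like a limit of 0, so trailing empty strings are dropped.
    static String[] splitDefault(String line) {
        return line.split(",");
    }

    // Negative limit: every field is kept, including trailing empty ones.
    static String[] splitKeepingTrailing(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String s = "elem1,elem2,,";
        System.out.println(Arrays.toString(splitDefault(s)));         // [elem1, elem2]
        System.out.println(Arrays.toString(splitKeepingTrailing(s))); // [elem1, elem2, , ]
    }
}
```

Since a Scala String is a java.lang.String, the same call, "elem1,elem2,,".split(",", -1), should work unchanged from Scala.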
You can find the related Java doc here.
I believe that trailing empty strings are not included in the return value.
The JavaDoc for split(String regex) says: "This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array."
So in your case split(String regex, int limit) should be used with a negative limit in order to get the trailing empty strings in the return value.
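As a rough illustration of how the limit argument changes the result, the sketch below runs the same input through a zero, a negative, and a positive limit. SplitLimitDemo and splitWithLimit are hypothetical names used only for this example.

```java
import java.util.Arrays;

// Hypothetical demo class showing the three regimes of the limit argument.
class SplitLimitDemo {
    // Thin wrapper around the two-argument String.split, splitting on commas.
    static String[] splitWithLimit(String line, int limit) {
        return line.split(",", limit);
    }

    public static void main(String[] args) {
        String s = "a,b,,";
        // limit == 0: split as many times as possible, then discard trailing empty strings
        System.out.println(Arrays.toString(splitWithLimit(s, 0)));  // [a, b]
        // limit < 0: split as many times as possible and keep everything
        System.out.println(Arrays.toString(splitWithLimit(s, -1))); // [a, b, , ]
        // limit > 0: at most that many elements; the last one holds the unsplit remainder
        System.out.println(Arrays.toString(splitWithLimit(s, 3)));  // [a, b, ,]
    }
}
```

For CSV-like input where the column count matters, a negative limit is usually the one you want, because the array length then always equals the number of delimiters plus one.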