
split string by char

Scala has a standard way of splitting a string in StringOps.split.

Its behaviour somewhat surprised me, though.

To demonstrate, using the quick convenience function

def sp(str: String) = str.split('.').toList

the following expressions all evaluate to true

(sp("") == List("")) //expected
(sp(".") == List()) //I would have expected List("", "")
(sp("a.b") == List("a", "b")) //expected
(sp(".b") == List("", "b")) //expected
(sp("a.") == List("a")) //I would have expected List("a", "")
(sp("..") == List()) // I would have expected List("", "", "")
(sp(".a.") == List("", "a")) // I would have expected List("", "a", "")

So I expected that split would return an array with (the number of separator occurrences) + 1 elements, but that's clearly not the case.

The actual behaviour is almost that, with all trailing empty strings removed, but that rule doesn't hold for splitting the empty string.

I'm failing to identify the pattern here. What rules does StringOps.split follow?

For bonus points, is there a good way (without too much copying/string appending) to get the split I'm expecting?

Martijn asked May 22 '15 10:05



2 Answers

For the curious, the code is here: https://github.com/scala/scala/blob/v2.12.0-M1/src/library/scala/collection/immutable/StringLike.scala

See the split method that takes a character as its argument (line 206).

I think the general pattern here is that all trailing empty split results are dropped.

The exception is the first case (the empty string): when no separator char is found, the "just return the whole string" logic applies instead.
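To make those two rules concrete, here is a minimal sketch that reproduces the results listed in the question. The name observedSplit is hypothetical and this is not the library source, only an illustration of the pattern described above.

```scala
// Hypothetical helper, for illustration only; not the code from StringLike.scala.
def observedSplit(s: String, sep: Char): List[String] =
  if (s.indexOf(sep) == -1) {
    // Rule 1: no separator found, so return the whole string (even when it is empty).
    List(s)
  } else {
    // Cut at every separator; this yields (separator occurrences + 1) pieces.
    val cuts   = s.indices.filter(i => s.charAt(i) == sep)
    val starts = 0 +: cuts.map(_ + 1)
    val ends   = cuts :+ s.length
    val all    = starts.zip(ends).map { case (a, b) => s.substring(a, b) }.toList
    // Rule 2: drop every trailing empty piece.
    all.reverse.dropWhile(_.isEmpty).reverse
  }
```

With this sketch, observedSplit("", '.') gives List("") because rule 1 applies first, while observedSplit(".", '.') gives List() because both pieces are empty and rule 2 removes them.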

I am trying to find out whether there is any design documentation about this.

Also, if you pass a String instead of a Char as the separator, the call falls back to Java's regex-based String.split. As mentioned by @LRLucena, if you pass a limit parameter larger than the number of resulting pieces, you keep the trailing empty results; according to the linked Javadoc, a negative limit has the same effect. See http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split(java.lang.String,%20int)
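As a small sketch of the limit parameter, assuming the regex overload java.lang.String.split(String, int) described in that Javadoc: a negative limit keeps trailing empty strings, while the one-argument form drops them. The names spTrimmed and spAll are made up for this example.

```scala
// Hypothetical helpers for illustration; "\\." is the regex for a literal dot.
def spTrimmed(str: String): List[String] = str.split("\\.").toList      // trailing empties dropped
def spAll(str: String): List[String]     = str.split("\\.", -1).toList  // negative limit: keep everything

// spTrimmed("a.")  gives List("a")
// spAll("a.")      gives List("a", "")
// spAll("..")      gives List("", "", "")
```

Using -1 avoids having to compute a limit that is guaranteed to be large enough.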

Biswanath answered Oct 17 '22 02:10


You can use split with a regular expression. I'm not sure, but I guess that the second parameter is the maximum size of the resulting array.

def sp(str: String) = str.split("\\.", str.length+1).toList
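If you would rather avoid the regex overload entirely, a regex-free variant is also possible. The following is only a sketch under that assumption; splitKeepAll is a hypothetical name, and it walks the string once with indexOf and substring, always returning (separator occurrences) + 1 pieces.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical helper, for illustration: split on a Char and keep every piece,
// including leading and trailing empty strings.
def splitKeepAll(s: String, sep: Char): List[String] = {
  val parts = ListBuffer.empty[String]
  var prev  = 0                       // start index of the current piece
  var pos   = s.indexOf(sep)          // index of the next separator, or -1
  while (pos != -1) {
    parts += s.substring(prev, pos)
    prev = pos + 1
    pos = s.indexOf(sep, prev)
  }
  parts += s.substring(prev)          // final piece; may be empty
  parts.toList
}
```

For example, splitKeepAll(".a.", '.') yields List("", "a", ""), which is the result the question expected.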
LRLucena answered Oct 17 '22 02:10