Java String split removed empty values

Tags: java, string, split

I am trying to split a value using a separator, but I am getting surprising results.

String data = "5|6|7||8|9||";
String[] split = data.split("\\|");
System.out.println(split.length);

I expect to get 8 values: [5,6,7,EMPTY,8,9,EMPTY,EMPTY]. But I am getting only 6 values.

Any idea why, and how to fix it? No matter where an EMPTY value occurs, it should be in the array.

RaceBase asked Jan 30 '13 10:01



2 Answers

By default, split(delimiter) removes trailing empty strings from the result array. To turn this mechanism off, use the overloaded version split(delimiter, limit) with limit set to a negative value, like

String[] split = data.split("\\|", -1); 
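The difference between the two calls can be checked with a minimal sketch like the one below. The class and method names (SplitKeepEmpty, splitDefault, splitKeepAll) are made up for illustration; only String.split itself comes from the JDK.

```java
import java.util.Arrays;

public class SplitKeepEmpty {
    // Default split: behaves like limit 0, so trailing empty strings are dropped.
    static String[] splitDefault(String data) {
        return data.split("\\|");
    }

    // Negative limit: every empty string, including trailing ones, is kept.
    static String[] splitKeepAll(String data) {
        return data.split("\\|", -1);
    }

    public static void main(String[] args) {
        String data = "5|6|7||8|9||";
        System.out.println(Arrays.toString(splitDefault(data))); // [5, 6, 7, , 8, 9]
        System.out.println(Arrays.toString(splitKeepAll(data))); // [5, 6, 7, , 8, 9, , ]
        System.out.println(splitDefault(data).length);           // 6
        System.out.println(splitKeepAll(data).length);           // 8
    }
}
```

Note that the empty string in the middle (between 7 and 8) is kept in both cases; only the trailing ones are affected.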

A few more details:
split(regex) internally returns the result of split(regex, 0), and the documentation of that method says:

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array.

If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter.

If n is non-positive then the pattern will be applied as many times as possible and the array can have any length.

If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
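The three cases quoted above (positive, zero, and negative limit) can be tried on the string from the question. This is only a sketch; SplitLimitDemo and splitWithLimit are hypothetical names chosen for the example.

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // Thin wrapper around String.split(regex, limit) for the "|" separator.
    static String[] splitWithLimit(String data, int limit) {
        return data.split("\\|", limit);
    }

    public static void main(String[] args) {
        String data = "5|6|7||8|9||";

        // limit > 0: pattern applied at most limit - 1 times,
        // the last entry holds the rest of the input unsplit.
        System.out.println(Arrays.toString(splitWithLimit(data, 3)));  // [5, 6, 7||8|9||]

        // limit == 0: split as often as possible, drop trailing empty strings.
        System.out.println(splitWithLimit(data, 0).length);            // 6

        // limit < 0: split as often as possible, keep everything.
        System.out.println(splitWithLimit(data, -1).length);           // 8
    }
}
```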

Exception:

It is worth mentioning that removing trailing empty strings makes sense only if those empty strings were created by the split mechanism. So for "".split(anything), since "" cannot be split any further, the result is the array [""].
This happens because no split took place, so "", despite being empty and trailing, represents the original string, not an empty string created by the splitting process.
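A small sketch of this exception, contrasted with an input that consists only of the delimiter. The class name SplitEmptyInput and the helper count are invented for the example.

```java
public class SplitEmptyInput {
    // Number of elements produced by the one-argument split on "|".
    static int count(String data) {
        return data.split("\\|").length;
    }

    public static void main(String[] args) {
        // No delimiter found, so no split happens: the original "" is returned as-is.
        System.out.println(count(""));                    // 1

        // Here a split does happen; both resulting empty strings are trailing
        // and are therefore discarded.
        System.out.println(count("|"));                   // 0

        // With a negative limit the two empty strings are kept.
        System.out.println("|".split("\\|", -1).length);  // 2
    }
}
```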

jlordo answered Sep 27 '22 21:09


From the documentation of String.split(String regex):

This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.

So you will have to use the two-argument version, String.split(String regex, int limit), with a negative limit:

String[] split = data.split("\\|", -1);

Doc:

If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

With a negative limit, no empty elements are left out, including the trailing ones.

ppeterka answered Sep 27 '22 22:09