
When processing a csv String with an empty final field, myString.split( "," ) returns incorrect number of array entries

I'm processing a certain line read from a .csv file using String.split(",") and finding that the final empty string after the last delimiter is not making it into the array created by the split function.

Here are the variable values that are causing the error:

String toSplit = "1,Some Value,31337,Another Value,";
String[] values = toSplit.split( "," );

The values array ends up having fewer than the expected count of array entries. The array created is:

values[0] : '1'
values[1] : 'Some Value'
values[2] : '31337'
values[3] : 'Another Value'

with values[4] throwing an ArrayIndexOutOfBoundsException.

The array I want is:

values[0] : '1'
values[1] : 'Some Value'
values[2] : '31337'
values[3] : 'Another Value'
values[4] : ''

One caveat: I am reading the output of another .csv file creator and don't want to use string delimiters or throw a ' ' whitespace character into places where the data is empty (i.e., I do not want toSplit = "1,Some Value,31337,Another Value, " with whitespace on the end).

Is this a bug in String.split() and is there a workaround/another option?

asked May 25 '13 23:05 by reor


1 Answer

Figured it out as I was entering the question! Use stringObj.split( ",", numEntriesExpected ) or, as @Supericy pointed out, stringObj.split( ",", -1 ).

According to the Android docs for String.split( delimiter ), a call to line.split( "," ) is equivalent to a call to line.split( ",", 0 ). The second argument is limit, which sets the 'maximum number of entries' and defines the 'treatment of trailing empty strings'. The documentation states that with limit == 0 'trailing empty strings will not be returned'.
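To make the equivalence concrete, here is a minimal sketch that runs both overloads on the string from the question. The class and helper names (SplitLimitZero, splitDefault, splitWithZeroLimit) are made up for illustration; only String.split itself comes from the standard library.

```java
import java.util.Arrays;

class SplitLimitZero {
    // Hypothetical helper: the one-argument overload used in the question.
    static String[] splitDefault(String line) {
        return line.split(",");
    }

    // Hypothetical helper: the two-argument overload with an explicit limit of 0.
    static String[] splitWithZeroLimit(String line) {
        return line.split(",", 0);
    }

    public static void main(String[] args) {
        String toSplit = "1,Some Value,31337,Another Value,";

        String[] a = splitDefault(toSplit);
        String[] b = splitWithZeroLimit(toSplit);

        // Both calls drop the trailing empty string.
        System.out.println(Arrays.toString(a));  // [1, Some Value, 31337, Another Value]
        System.out.println(a.length);            // 4
        System.out.println(Arrays.equals(a, b)); // true
    }
}
```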

The solution is to use a call to split( ",", limit ) with limit set to the number of array entries you expect to get back. In my case a limit set to 5 returns the array

values[0] : '1'
values[1] : 'Some Value'
values[2] : '31337'
values[3] : 'Another Value'
values[4] : ''

which is exactly what I wanted. Problem solved.
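As a rough sketch of the fixed-limit approach, the snippet below uses a limit of 5 on the same input. The class and helper names (SplitFixedLimit, splitWithLimit) are assumptions for illustration. It also shows one thing to watch for: if the limit is smaller than the real number of fields, the remainder of the line is left unsplit in the last entry, so this approach only fits when the column count is known in advance.

```java
import java.util.Arrays;

class SplitFixedLimit {
    // Hypothetical helper: split on commas with a caller-supplied positive limit.
    static String[] splitWithLimit(String line, int limit) {
        return line.split(",", limit);
    }

    public static void main(String[] args) {
        String toSplit = "1,Some Value,31337,Another Value,";

        // Limit equal to the expected number of fields keeps the trailing empty entry.
        String[] five = splitWithLimit(toSplit, 5);
        System.out.println(five.length);       // 5
        System.out.println(five[4].isEmpty()); // true

        // A limit that is too small leaves the rest of the line in the last entry.
        String[] three = splitWithLimit(toSplit, 3);
        System.out.println(Arrays.toString(three)); // [1, Some Value, 31337,Another Value,]
    }
}
```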

Note that limit is only an upper bound: if your toSplit string has fewer delimiters than limit, split does not pad the array with extra empty entries. For example, with the same input string as in my question:

String toSplit = "1,Some Value,31337,Another Value,";

a call String[] values = toSplit.split( ",", 8 ) returns the same five entries

values[0] : '1'
values[1] : 'Some Value'
values[2] : '31337'
values[3] : 'Another Value'
values[4] : ''

so any positive limit at least as large as the number of fields preserves the trailing empty string.

Note: Oracle's Java String.split( delimiter ) behaves the same way; the Android docs just win for their more concise explanation. (:

Edit: As @Supericy added, toSplit.split( ",", -1 ) would also properly return the trailing empty entry. Thanks for the addition!
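For completeness, a small sketch of the negative-limit variant, which does not require knowing the field count up front. The class and helper names (SplitKeepTrailing, splitKeepingEmpties) are hypothetical; the behavior shown is that of the standard String.split( regex, limit ) with a negative limit.

```java
class SplitKeepTrailing {
    // Hypothetical helper: a negative limit keeps every field, including trailing empties.
    static String[] splitKeepingEmpties(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        // The line from the question: four values plus one trailing empty field.
        System.out.println(splitKeepingEmpties("1,Some Value,31337,Another Value,").length); // 5

        // Empty fields in the middle and at the end are all kept.
        System.out.println(splitKeepingEmpties("a,,b,,").length); // 5

        // A line made only of delimiters yields one empty field per position.
        System.out.println(splitKeepingEmpties(",,").length); // 3
    }
}
```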

answered Oct 02 '22 02:10 by reor