
Groovy split csv and empty fields

Tags:

groovy

Groovy split seems to be ignoring empty fields.

Here is the code:

def line = "abc,abc,,,"
println line.split(/,/)

prints only:

[abc, abc]

It seems to ignore empty fields. How do I retrieve empty fields using split?

janar asked Dec 11 '22 13:12



2 Answers

First of all, the split(regex) method is provided by Java's String class, not by Groovy.

Second, you can get what you need by using the two-argument overload split(regex, int limit) with a negative limit, as below:

def line = "abc,abc,,,"

println line.split(/,/, -1) //prints [abc, abc, , , ]
println line.split(/,/, -1).size() //prints 5

Note:
The printed form of the string array, [abc, abc, , , ], is not a valid Groovy list literal, so pasting it into an assert would cause a compilation error. The result itself can still be used like a normal list:

line.split(/,/, -1).each{println "Hello $it"}

When the trailing empty strings are unwanted, I would rather use a limit of 0 or the one-argument split(regex), both of which discard them.

Explanation of using -1 as the limit:
Note the following statements from the Javadoc for String.split(String regex, int limit):

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
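
The three cases described in that passage can be checked side by side. The following is a minimal sketch using the question's input; the variable names are illustrative only, and each result is converted with toList() so it can be compared against a list literal.

```groovy
def line = 'abc,abc,,,'

// Negative limit: apply the pattern as often as possible and keep trailing empty strings.
def keepAll = line.split(/,/, -1).toList()
assert keepAll == ['abc', 'abc', '', '', '']

// Zero limit (same behaviour as the one-argument split): trailing empty strings are discarded.
def dropTrailing = line.split(/,/, 0).toList()
assert dropTrailing == ['abc', 'abc']
assert dropTrailing == line.split(/,/).toList()

// Positive limit n: at most n - 1 splits; the last entry holds the rest of the input.
def limited = line.split(/,/, 2).toList()
assert limited == ['abc', 'abc,,,']
```

Only the negative limit returns all five fields of the input, which is why -1 is used in the answer above.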

dmahapatro answered Feb 08 '23 12:02



Interesting. The split method works as expected provided there's a non-empty element at the end.

def list = 'abc,abc,,,abc'.split(/,/)
println list // prints [abc, abc, , , abc]
assert list.size() == 5
assert list[0] == 'abc'
assert list[1] == 'abc'
assert list[2] == ''
assert list[3] == ''
assert list[4] == 'abc'

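Maybe you could just append a bogus character to the end of the string and remove it from the result. Note that subtracting 'X' removes every element equal to 'X', so the bogus value must not occur in the real data: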

def list = 'abc,abc,,,X'.split(/,/) - 'X'
println list // prints [abc, abc, , ]
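
A positional variant of the same idea, sketched under the assumption that all five fields of the question's input should be kept: append a separator plus the bogus value, then drop only the last element by index. The 'X' sentinel and the variable names are placeholders chosen for this sketch.

```groovy
def line = 'abc,abc,,,'

// Append a separator plus a sentinel so the original last field is no longer trailing.
// 'X' is an arbitrary placeholder chosen for this sketch.
def parts = (line + ',X').split(/,/).toList()

// Drop only the final element (the sentinel) by position rather than by value.
def fields = parts[0..-2]

assert fields == ['abc', 'abc', '', '', '']
// Same result as the negative-limit split.
assert fields == line.split(/,/, -1).toList()
```

Because the sentinel is removed by position, a genuine field that happens to equal 'X' would be preserved.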

bdkosher answered Feb 08 '23 12:02
