
How to split a comma separated String while ignoring escaped commas?

Tags: java, regex, csv

I need to write an extended version of the StringUtils.commaDelimitedListToStringArray function which takes an additional parameter: the escape char.

So calling:

commaDelimitedListToStringArray("test,test\\,test\\,test,test", "\\") 

should return:

["test", "test,test,test", "test"] 



My current attempt is to use String.split() to split the String using regular expressions:

String[] array = str.split("[^\\\\],"); 

But the returned array is:

["tes", "test\,test\,tes", "test"] 

Any ideas?

arturh asked May 04 '09 13:05



1 Answer

The regular expression

[^\\], 

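means "match any single character that is not a backslash, followed by a comma". That is why sequences such as t, match: t is a character that is not a backslash. Because that character is part of the match, split() treats it as part of the delimiter and removes it along with the comma.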
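To make the consumed character visible, here is a small sketch that lists what the pattern from the question actually matches in the sample input. The class and method names (ConsumedCharDemo, matchedText) are made up for this illustration; only the pattern and the input come from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ConsumedCharDemo {

    // Collects every substring matched by the question's pattern [^\\],
    // (written here as a Java string literal, so each backslash is doubled).
    static String[] matchedText(String input) {
        Matcher m = Pattern.compile("[^\\\\],").matcher(input);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group());
        }
        return hits.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Runtime value: test,test\,test\,test,test
        String input = "test,test\\,test\\,test,test";

        // Each match is two characters long: the letter before the comma plus the comma.
        System.out.println(Arrays.toString(matchedText(input)));

        // split() removes whole matches, so the trailing 't' disappears from two of the pieces.
        System.out.println(Arrays.toString(input.split("[^\\\\],")));
    }
}
```

Both matches are the two-character sequence t, which is why the pieces in the question come back as tes instead of test.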

I think you need a negative lookbehind, which matches a , that is not preceded by a \ without consuming the preceding character. Something like:

(?<!\\), 

(BTW, note that I have deliberately not double-escaped the backslashes, to keep this readable. In a Java string literal the pattern is written "(?<!\\\\),".)
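Putting the lookbehind to work, a minimal sketch of the method the question asks for could look like the following. It assumes the escape string is non-empty, and it assumes that removing the escape in front of each kept comma is the desired post-processing (the expected output in the question suggests so). The class name EscapedCommaSplitter is hypothetical, and this is not Spring's implementation.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class EscapedCommaSplitter {

    // Splits on commas that are not directly preceded by the escape string,
    // then turns each escaped comma in the pieces back into a plain comma.
    static String[] commaDelimitedListToStringArray(String str, String escape) {
        // Pattern.quote makes the escape string literal, so a backslash
        // (or any other regex metacharacter) is safe to use as the escape.
        String regex = "(?<!" + Pattern.quote(escape) + "),";
        String[] parts = str.split(regex);
        for (int i = 0; i < parts.length; i++) {
            // String.replace works on literal text, not on a regex.
            parts[i] = parts[i].replace(escape + ",", ",");
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] result =
                commaDelimitedListToStringArray("test,test\\,test\\,test,test", "\\");
        System.out.println(Arrays.toString(result));
    }
}
```

One limitation of this approach: the lookbehind only inspects the text immediately before the comma, so an escaped escape character followed by a real delimiter (for example \\, in the runtime string) is still treated as an escaped comma. If that case matters, scanning the string character by character is probably easier than a regex.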

matt b answered Sep 23 '22 17:09