
String.split vs StringUtils.split in Java gives different results

Consider a string like the one below, with the delimiter __|__.

String str = "a_b__|__c_d";

str.split("__\\|__") gives 2 splits: a_b and c_d.

StringUtils.split(str, "__|__") or StringUtils.split(str, "__\\|__") gives 4 splits: a, b, c and d, which is not desired.

Is there any way to make StringUtils.split() give the same results as String.split()?

user1013528 asked Mar 26 '26 19:03


2 Answers

String.split() has some very surprising semantics (it treats its argument as a regular expression, and it silently drops trailing empty strings), and it's rarely what you want. You should prefer StringUtils (or Guava's Splitter).
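To make the "surprising semantics" concrete, here is a minimal sketch using only the JDK. The class and method names (SplitSurprises, defaultSplit, keepTrailing) are made up for illustration; only String.split itself is real API.

```java
import java.util.Arrays;

class SplitSurprises {
    // With no limit argument, trailing empty strings are dropped from the result.
    static String[] defaultSplit(String s) {
        return s.split(",");
    }

    // A negative limit keeps the trailing empty strings.
    static String[] keepTrailing(String s) {
        return s.split(",", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(defaultSplit("a,b,,")));  // [a, b]
        System.out.println(Arrays.toString(keepTrailing("a,b,,")));  // [a, b, , ]
        // "." is a regex that matches any character, so every piece is empty
        // and all of them are dropped as trailing empties.
        System.out.println(Arrays.toString("a.b".split(".")));       // []
    }
}
```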

Your specific issue is that String.split() takes a regular expression, while StringUtils.split() treats each character of its second argument as a separate delimiter. You should use StringUtils.splitByWholeSeparator() to split on the separator string as a whole.

StringUtils.splitByWholeSeparator(str, "__|__");
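If it helps to see why the per-character behaviour yields four pieces, here is a rough JDK-only illustration. It uses java.util.StringTokenizer, which follows the same "every character of the delimiter string is its own delimiter" convention. This is an analogy, not StringUtils itself, and the names SeparatorDemo and splitOnAnyChar are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class SeparatorDemo {
    // Each character in delims is an independent delimiter;
    // runs of delimiters produce no empty tokens.
    static List<String> splitOnAnyChar(String input, String delims) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delims);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // '_' and '|' are both delimiters here, so a_b and c_d get broken up too.
        System.out.println(splitOnAnyChar("a_b__|__c_d", "__|__"));  // [a, b, c, d]
    }
}
```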
dimo414 answered Mar 28 '26 08:03


No. Per the documentation, the second parameter of StringUtils.split is the set of characters that are treated as separators. There is a different function in Apache Commons that does what you want: StringUtils.splitByWholeSeparator. Still, I don't see what's wrong with plain String.split.
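For what it's worth, if you stay with plain String.split, you can avoid hand-escaping the delimiter by wrapping it in Pattern.quote. A small sketch, where LiteralSplit and splitLiteral are illustrative names and not from any library:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralSplit {
    // Pattern.quote turns the separator into a literal regex,
    // so characters like '|' lose their special meaning.
    static String[] splitLiteral(String input, String separator) {
        return input.split(Pattern.quote(separator));
    }

    public static void main(String[] args) {
        String str = "a_b__|__c_d";
        System.out.println(Arrays.toString(splitLiteral(str, "__|__")));  // [a_b, c_d]
    }
}
```

Note that this still has String.split's habit of dropping trailing empty strings unless you pass a negative limit.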

yeputons answered Mar 28 '26 10:03


