Split String By Character

I have a case in which I'm doing the following:

final String[] columns = row.split(delimiter.toString());

Where delimiter is a Character.

This works fine when I need to split based on tabs by providing \t as the delimiter. However, when I want to split on a pipe, I pass in a delimiter of | and this does not work as expected.

I've read several posts explaining that | is a special character in regular expressions: left unescaped, it matches the empty string, so the split happens at every character. I don't want this behavior.

I could do a simple check in my code for this pipe case and get around the issue:

if ("|".equals(delimiter.toString())) {
    columns = row.split("\\" + delimiter.toString());
}
else {
    columns = row.split(delimiter.toString());
} 

But I didn't know if there was an easier way to get around this. Also, are there any other special characters that act like the | does that I need to take into account?

Dan W asked May 13 '13 14:05


2 Answers

Try:

import java.util.regex.Pattern;

...

final String[] columns = row.split(Pattern.quote(delimiter.toString()));

With regard to the other metacharacters, as they're called, here's a quote from the String Literals tutorial:

This API also supports a number of special characters that affect the way a pattern is matched.

...

The metacharacters supported by this API are: <([{\^-=$!|]})?*+.>

See:

  • Pattern class reference
  • "Methods of the Pattern Class" from The Java Tutorial
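To illustrate the point, here is a minimal sketch that contrasts an unquoted pipe with a quoted one and then tries a few of the other metacharacters. The class name SplitDemo, the helper splitLiteral, and the sample rows are assumptions made up for this example; only String.split and Pattern.quote come from the answer above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDemo {

    // Hypothetical helper: treats the delimiter as literal text, never as a regex.
    static String[] splitLiteral(String row, Character delimiter) {
        return row.split(Pattern.quote(delimiter.toString()));
    }

    public static void main(String[] args) {
        String row = "a|b|c";

        // Unquoted: "|" is regex alternation between two empty patterns,
        // so the row is not split into the three expected columns.
        System.out.println(Arrays.toString(row.split("|")));

        // Quoted: the pipe is matched literally.
        System.out.println(Arrays.toString(splitLiteral(row, '|')));

        // The same helper copes with other metacharacters without special cases.
        for (char d : "|.*+?$^".toCharArray()) {
            String sample = "a" + d + "b" + d + "c";
            System.out.println(d + " -> " + Arrays.toString(splitLiteral(sample, d)));
        }
    }
}
```

Because Pattern.quote wraps whatever it is given, the same call also works unchanged for ordinary delimiters such as a tab, so the if/else from the question should not be needed.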
wchargin answered Nov 12 '22 12:11


  1. You can use StringUtils from Apache Commons Lang which is equipped with methods accepting plain text, not regular expressions:

    public static String[] split(String str, char separatorChar)
    public static String[] split(String str, String separatorChars)
    
  2. You can also use the StringTokenizer class, which does not expect a regular expression as the delimiter.
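As a rough sketch of the second option, the following uses only java.util.StringTokenizer from the JDK. The class name TokenizerSplitDemo and the helper splitWithTokenizer are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerSplitDemo {

    // Hypothetical helper: the delimiter is plain text here, so "|" needs no escaping.
    static List<String> splitWithTokenizer(String row, Character delimiter) {
        StringTokenizer tokenizer = new StringTokenizer(row, delimiter.toString());
        List<String> columns = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            columns.add(tokenizer.nextToken());
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(splitWithTokenizer("a|b|c", '|'));

        // Note: consecutive delimiters are collapsed, so the empty column is dropped.
        System.out.println(splitWithTokenizer("a|b||c", '|'));
    }
}
```

One caveat if the rows can contain empty columns: StringTokenizer skips them, and StringUtils.split is documented as treating adjacent separators as one as well. In that situation, Pattern.quote with String.split, or StringUtils.splitPreserveAllTokens from Commons Lang, is probably the better fit.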

Adam Siemion answered Nov 12 '22 11:11