Java Regex Help: Splitting String on spaces, "=>", and commas

Tags: java, regex

I need to split a string on any of the following sequences:

1 or more spaces
0 or more spaces, followed by a comma, followed by 0 or more spaces
0 or more spaces, followed by "=>", followed by 0 or more spaces

I haven't worked with Java regexes before, so I'm a little confused. Thanks!

Example:
add r10,r12 => r10
store r10 => r1

meteoritepanama asked Sep 06 '10 21:09


2 Answers

Just create a regex that matches any of your three cases and pass it to the split method:

string.split("\\s*(=>|,|\\s)\\s*");

The regex reads literally as:

  1. Zero or more whitespace characters (\\s*)
  2. An arrow, a comma, or a whitespace character (=>|,|\\s)
  3. Zero or more whitespace characters (\\s*)

If you only want to split on literal spaces, you can replace \\s (which matches spaces, tabs, line breaks, etc.) with a plain space character.
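As a quick check against the two lines from the question, here is a minimal sketch that applies this pattern to both. The class name SplitExamples and the helper tokens are made up for illustration; only the regex comes from the answer.

```java
import java.util.Arrays;

class SplitExamples {

    // Pattern from the answer: optional whitespace, then one delimiter
    // ("=>", "," or a single whitespace character), then optional whitespace.
    static final String DELIMITER = "\\s*(=>|,|\\s)\\s*";

    // Hypothetical helper: splits one input line into its tokens.
    static String[] tokens(String line) {
        return line.split(DELIMITER);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("add r10,r12 => r10")));
        // prints [add, r10, r12, r10]
        System.out.println(Arrays.toString(tokens("store r10 => r1")));
        // prints [store, r10, r1]
    }
}
```

A line that starts with whitespace would yield an empty first token, so calling trim() on the line before splitting may be worthwhile.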

Nikita Rybak answered Sep 20 '22 15:09


Strictly translated

For simplicity, I'm going to interpret your indication of "space" ( ) as "any whitespace" (\s).

Translating your spec more or less "word for word" means delimiting on any of:

  • 1 or more spaces
    • \s+
  • 0 or more spaces (\s*), followed by a comma (,), followed by 0 or more spaces (\s*)
    • \s*,\s*
  • 0 or more spaces (\s*), followed by a "=>" (=>), followed by 0 or more spaces (\s*)
    • \s*=>\s*

To match any of the above: (\s+|\s*,\s*|\s*=>\s*)

Reduced form

However, your spec can be "reduced" to:

  • 0 or more spaces
    • \s*
  • followed by either a space, comma, or "=>"
    • (\s|,|=>)
  • followed by 0 or more spaces
    • \s*

Put it all together: \s*(\s|,|=>)\s*

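The reduced form avoids a corner case in the strictly translated form, which can produce unexpected empty "matches". Because \s+ is tried first, an input such as "four , five" is split once on the space and then again on the comma, leaving an empty string between the two delimiters.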
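To make the difference concrete, the following sketch runs both patterns over a comma surrounded by spaces. The class name StrictVsReduced and the sample input are chosen for illustration only.

```java
import java.util.Arrays;

class StrictVsReduced {

    // Strictly translated form: three alternatives, \s+ listed first.
    static final String STRICT = "(\\s+|\\s*,\\s*|\\s*=>\\s*)";

    // Reduced form: one delimiter wrapped in optional whitespace.
    static final String REDUCED = "\\s*(\\s|,|=>)\\s*";

    public static void main(String[] args) {
        String input = "four , five";

        // Strict: \s+ wins at the first space and stops before the comma,
        // so ", " becomes a second delimiter right next to the first one.
        System.out.println(Arrays.toString(input.split(STRICT)));
        // prints [four, , five]

        // Reduced: the whole " , " run is consumed as a single delimiter.
        System.out.println(Arrays.toString(input.split(REDUCED)));
        // prints [four, five]
    }
}
```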

Code

Here's some code:

import java.util.regex.Pattern;

public class Temp {

    // Strictly translated form:
    //private static final String REGEX = "(\\s+|\\s*,\\s*|\\s*=>\\s*)";

    // "Reduced" form:
    private static final String REGEX = "\\s*(\\s|=>|,)\\s*";

    private static final String INPUT =
        "one two,three=>four , five   six   => seven,=>";

    public static void main(final String[] args) {
        final Pattern p = Pattern.compile(REGEX);
        final String[] items = p.split(INPUT);
        // Shorthand for above:
        // final String[] items = INPUT.split(REGEX);
        for(final String s : items) {
            System.out.println("Match: '"+s+"'");
        }
    }
}

Output:

Match: 'one'
Match: 'two'
Match: 'three'
Match: 'four'
Match: 'five'
Match: 'six'
Match: 'seven'
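
Note that the input ends in ",=>", yet no empty items show up in the output. That is because split with the default limit drops trailing empty strings. If you need them, passing a negative limit keeps them. A small sketch, reusing the same regex and input (the class name TrailingEmpties is made up):

```java
class TrailingEmpties {

    static final String REGEX = "\\s*(\\s|=>|,)\\s*";

    static final String INPUT =
        "one two,three=>four , five   six   => seven,=>";

    public static void main(String[] args) {
        // Default limit (0): trailing empty strings are removed.
        System.out.println(INPUT.split(REGEX).length);     // prints 7

        // Negative limit: the empty strings after "," and "=>" are kept.
        System.out.println(INPUT.split(REGEX, -1).length); // prints 9
    }
}
```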
Bert F answered Sep 16 '22 15:09