
Split a string into several substrings using regex. Both matches and non-matches should be returned

Tags: java, regex

I would like to split a string into several substrings, and I think using regular expressions could help me.

I want this:        To become this:
<choice1>           {<choice1>}
c<hoi>ce2           {c, <hoi>, ce2}
<ch><oi><ce>3       {<ch>, <oi>, <ce>, 3}
choice4             {choice4}

Note that the curly brackets and commas are just a visual aid. It doesn't really matter what the final form is, just that the values are separately accessible/replaceable.

Thanks in advance.

user2642253 asked Oct 28 '13 12:10



2 Answers

This code should work:

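import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "<ch><oi><ce>3";
// Match either a bracketed group such as <ch>, or a run of word characters.
Pattern p = Pattern.compile("<[^>]*>|\\w+");
Matcher m = p.matcher(str);
while (m.find()) {
    System.out.printf("=> %s%n", m.group());
}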

OUTPUT:

=> <ch>
=> <oi>
=> <ce>
=> 3
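
One caveat worth keeping in mind: `\w+` only matches letters, digits and underscores, so any spaces or punctuation outside the brackets would be skipped rather than returned. The sketch below compares that pattern with a `[^<>]+` alternative. The input string and the class and method names are made up for illustration and are not part of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordClassCaveat {
    // Collects every match of the given regex, in order of appearance.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "a b<c>d-e"; // hypothetical input with a space and a hyphen

        // \w+ stops at the space and the hyphen, so both are lost.
        System.out.println(findAll("<[^>]*>|\\w+", input));   // [a, b, <c>, d, e]

        // [^<>]+ keeps everything between the brackets intact.
        System.out.println(findAll("<[^>]*>|[^<>]+", input)); // [a b, <c>, d-e]
    }
}
```

For the four inputs in the question both patterns give the same tokens, so this only matters if the real data contains characters other than word characters.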
anubhava answered Sep 29 '22 10:09


With split:

input.split("(?<!^)(?=<)|(?<=>)(?!$)");
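
The pattern above splits at zero-width positions: just before a `<` that is not at the start of the string, or just after a `>` that is not at the end. A minimal sketch that runs it over the four inputs from the question (the class, constant and method names are made up for illustration):

```java
import java.util.Arrays;

public class SplitDemo {
    // Zero-width boundaries: before '<' (except at the start of the input)
    // or after '>' (except at the end of the input).
    static final String BOUNDARY = "(?<!^)(?=<)|(?<=>)(?!$)";

    static String[] tokens(String input) {
        return input.split(BOUNDARY);
    }

    public static void main(String[] args) {
        String[] samples = {"<choice1>", "c<hoi>ce2", "<ch><oi><ce>3", "choice4"};
        for (String s : samples) {
            System.out.println(Arrays.toString(tokens(s)));
        }
        // Prints:
        // [<choice1>]
        // [c, <hoi>, ce2]
        // [<ch>, <oi>, <ce>, 3]
        // [choice4]
    }
}
```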

Though I would match them instead:

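import java.util.regex.Matcher;
import java.util.regex.Pattern;

// A bracketed group, or a run of any characters outside brackets.
Matcher m = Pattern.compile("<[^>]*>|[^<>]+").matcher(input);
while (m.find()) {
    String token = m.group(); // the matched value
}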
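
Since the question asks for the pieces to be separately accessible and replaceable, one possible way to use the matching approach is to collect the tokens into a list, change one, and join them back together. This is only a sketch: the class name, the `tokenize` helper and the replacement value `<new>` are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenReplace {
    private static final Pattern TOKEN = Pattern.compile("<[^>]*>|[^<>]+");

    // Returns the bracketed and non-bracketed pieces of the input, in order.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("c<hoi>ce2");
        System.out.println(tokens);                  // [c, <hoi>, ce2]

        tokens.set(1, "<new>");                      // replace the bracketed piece
        System.out.println(String.join("", tokens)); // c<new>ce2
    }
}
```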
Anirudha answered Sep 29 '22 10:09