Repeating the same pattern on a regex in java?

Tags: java, regex

I have a string which may have one of two formats:

  • (someName,true); (where someName can be any combination of letters and numbers, and the comma is followed by either true or false)
  • (someName,true), (anything,false), (pepe12,true); (here there can be any number of parenthesized groups, each separated from the next by a comma followed by whitespace)

Given the following test set:

(hola,false);
comosoy12,true);
caminare)
true,comoestas

I used the regex ^\(.*,(true|false)[)][;$] and got the expected results: true, false, false, false. But I cannot come up with a regex for the following cases:

(someName,true), (anything,false), (pepe12,true);
(hola,false);
comosoy12,true);
(batman,true), (kittycat,false);
(batman,true); (kittycat,false);

Which should return true, true, false, true, false.

Carrol asked Sep 13 '19 09:09


1 Answer

You may use

^\(\w+,(?:true|false)\)(?:,\s*\(\w+,(?:true|false)\))*;$

Note that .* in your pattern matches any zero or more characters other than line break characters, whereas you want to match only letters and digits. I therefore suggest \w (note that it also matches _); alternatively, you may use \p{Alnum} or [A-Za-z0-9].
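The difference between these character classes only shows up when a name contains an underscore. The following is a minimal sketch of that comparison; the class name NameClassDemo, its helper methods, and the input "(some_name,true);" are made up for illustration and are not part of the original question.

```java
import java.util.regex.Pattern;

class NameClassDemo {
    // Builds the full validation regex around a given character class for the name.
    static Pattern forNameClass(String nameClass) {
        String block = "\\(" + nameClass + "+,(?:true|false)\\)";
        return Pattern.compile(block + "(?:,\\s*" + block + ")*;");
    }

    // True only if the whole input matches the pattern built for nameClass.
    static boolean isValid(String nameClass, String input) {
        return forNameClass(nameClass).matcher(input).matches();
    }

    public static void main(String[] args) {
        String withUnderscore = "(some_name,true);"; // hypothetical input
        System.out.println(isValid("\\w", withUnderscore));          // true
        System.out.println(isValid("\\p{Alnum}", withUnderscore));   // false
        System.out.println(isValid("[A-Za-z0-9]", withUnderscore));  // false
        System.out.println(isValid("\\p{Alnum}", "(pepe12,true);")); // true
    }
}
```

If underscores must be rejected, \p{Alnum} or [A-Za-z0-9] is the safer choice; otherwise \w is the shortest option.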

Pattern details

  • ^ - start of string
  • \(\w+,(?:true|false)\) - block: (, 1+ word chars (or alphanumeric chars if you use [a-zA-Z0-9] or \p{Alnum}), a comma, true or false, and )
  • (?:,\s*\(\w+,(?:true|false)\))* - 0 or more sequences of
    • , - comma
    • \s* - 0+ whitespaces
    • \(\w+,(?:true|false)\) - block pattern
  • ; - a ; char
  • $ - end of string
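The breakdown above can also be kept next to the pattern itself. This is only a sketch, assuming you are willing to use the Pattern.COMMENTS flag (which makes the engine ignore whitespace and #-comments inside the pattern); the class name CommentedPattern and its isValid helper are illustrative.

```java
import java.util.regex.Pattern;

class CommentedPattern {
    // Same regex as above, spread over several lines with inline comments.
    static final Pattern LIST = Pattern.compile(
        "^                                  # start of string\n" +
        "\\( \\w+ , (?:true|false) \\)      # first block\n" +
        "(?:                                # zero or more sequences of:\n" +
        "  , \\s*                           #   comma, then 0+ whitespaces\n" +
        "  \\( \\w+ , (?:true|false) \\)    #   another block\n" +
        ")*\n" +
        ";                                  # a ; char\n" +
        "$                                  # end of string",
        Pattern.COMMENTS);

    static boolean isValid(String input) {
        return LIST.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("(batman,true), (kittycat,false);")); // true
        System.out.println(isValid("(batman,true); (kittycat,false);")); // false
    }
}
```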

In Java, you may build the regex dynamically. Since String.matches (like Matcher.matches) only succeeds on a full string match, you may drop the initial ^ and final $ anchors:

String block = "\\(\\w+,(?:true|false)\\)";
String regex = block + "(?:,\\s*" + block + ")*;";
boolean result = s.matches(regex);
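Dropping the anchors is only safe with matches. As a rough sketch of why, the example below compares Matcher.matches with Matcher.find on the same unanchored pattern; the class name AnchorDemo and its method names are invented for this illustration.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    static final String BLOCK = "\\(\\w+,(?:true|false)\\)";
    static final String BODY = BLOCK + "(?:,\\s*" + BLOCK + ")*;";
    static final Pattern UNANCHORED = Pattern.compile(BODY);
    static final Pattern ANCHORED = Pattern.compile("^" + BODY + "$");

    // matches() must consume the whole input, so no anchors are needed.
    static boolean fullMatch(String s) {
        return UNANCHORED.matcher(s).matches();
    }

    // find() accepts a match anywhere in the input.
    static boolean partialFind(String s) {
        return UNANCHORED.matcher(s).find();
    }

    // find() with explicit ^ and $ rejects the partial match again.
    static boolean anchoredFind(String s) {
        return ANCHORED.matcher(s).find();
    }

    public static void main(String[] args) {
        String input = "(batman,true); (kittycat,false);";
        System.out.println(fullMatch(input));    // false
        System.out.println(partialFind(input));  // true: "(batman,true);" is found
        System.out.println(anchoredFind(input)); // false
    }
}
```

So if the code is later changed to use find, the ^ and $ anchors should be put back.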

See Java demo online:

import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

List<String> strs = Arrays.asList("(someName,true), (anything,false), (pepe12,true);","(hola,false);","comosoy12,true);", "(batman,true), (kittycat,false);", "(batman,true); (kittycat,false);");
String block = "\\(\\w+,(?:true|false)\\)";
String regex = block + "(?:,\\s*" + block + ")*;";
Pattern p = Pattern.compile(regex);
// matches() requires the whole string to match, so no ^ or $ is needed
for (String str : strs)
    System.out.println(str + " => " + p.matcher(str).matches());

Output:

(someName,true), (anything,false), (pepe12,true); => true
(hola,false); => true
comosoy12,true); => false
(batman,true), (kittycat,false); => true
(batman,true); (kittycat,false); => false
Wiktor Stribiżew answered Oct 02 '22 20:10