
Set two flags in Java regex.Pattern

I need a matcher like this:

Matcher kuchen = Pattern
        .compile("gibt es Kuchen in der K\u00FCche", Pattern.CASE_INSENSITIVE)
        .matcher("");

The problem is that the pattern is not plain ASCII. I know that in this particular case I could use [\u00FC\u00DC] for the ü, but I need something more general (the regex is built from other matcher groups). According to the Javadoc:

By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by specifying the UNICODE_CASE flag in conjunction with this flag.

Can anybody tell me how to specify the two flags together?

davide asked Aug 20 '13 09:08




1 Answer

Try

Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE 

This should solve the issue. The flags are bit masks, so OR-ing them together enables both features at once.
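For illustration, here is a minimal sketch of the question's pattern compiled with both flags. The class and field names (KuchenFlags, KUCHEN, KUCHEN_EMBEDDED, ASCII_ONLY) and the upper-case test input are made up for this example. It also shows the embedded flag expression (?iu), which should behave the same way and can be handy when the regex is assembled from other strings.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; all names here are illustrative only.
class KuchenFlags {
    static final String REGEX = "gibt es Kuchen in der K\u00FCche";

    // Both flags combined with bitwise OR into a single int bit mask.
    static final Pattern KUCHEN =
            Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // Same flags written inline: (?i) is CASE_INSENSITIVE, (?u) is UNICODE_CASE.
    static final Pattern KUCHEN_EMBEDDED = Pattern.compile("(?iu)" + REGEX);

    // CASE_INSENSITIVE alone only folds case for US-ASCII characters.
    static final Pattern ASCII_ONLY =
            Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) {
        // Upper-case input, including an upper-case U with umlaut.
        String input = "GIBT ES KUCHEN IN DER K\u00DCCHE";

        System.out.println(KUCHEN.matcher(input).matches());          // true
        System.out.println(KUCHEN_EMBEDDED.matcher(input).matches()); // true
        System.out.println(ASCII_ONLY.matcher(input).matches());      // false
    }
}
```

The third pattern is there only for contrast: without UNICODE_CASE the upper-case umlaut is not folded, so the match fails.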

Roman C answered Oct 08 '22 12:10