What does the regex string "\\p{Cntrl}" match in Java?

Tags: java, regex

I think it's matching all control characters (not sure what "all" might be) but I can't be certain, nor can I find it in any documentation other than some musings in a Perl forum. Does anyone know?

Wes Nolte asked Jun 04 '11 20:06



2 Answers

\p{name} matches a named character class. Depending on the name, that is either a POSIX class or a Unicode property; consult the appropriate spec to see which code points are in the class. Here is a discussion specific to the Java regex engine, where Cntrl is one of the examples: "Any ASCII control character in the range 0-127. This effectively means characters 0-31 and 127." The same thing applies to many other regex engines.
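A quick way to check the "0-31 and 127" claim is to test every code point from 0 to 255 against the pattern. This is only a sketch: the class name CntrlRange and the method matchingCodePoints are made up for illustration, and it assumes the pattern is compiled with default flags (no UNICODE_CHARACTER_CLASS).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class CntrlRange {
    // Hypothetical helper: collects every code point in 0..255
    // that the default-flag pattern \p{Cntrl} matches.
    static List<Integer> matchingCodePoints() {
        Pattern cntrl = Pattern.compile("\\p{Cntrl}");
        List<Integer> matches = new ArrayList<>();
        for (int cp = 0; cp <= 0xFF; cp++) {
            String single = String.valueOf((char) cp);
            if (cntrl.matcher(single).matches()) {
                matches.add(cp);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prints the list of matching code points.
        System.out.println(matchingCodePoints());
    }
}
```

With default flags the list should stop at 127, so the C1 controls (128 to 159) are not included.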

geekosaur answered Oct 18 '22 20:10


From the documentation of Pattern:

\p{Cntrl} A control character: [\x00-\x1F\x7F]

That is, it matches any character with hexadecimal value 00 through 1F or 7F.
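For what it's worth, here is a minimal sketch of how the class is typically used, namely stripping control characters from a string. The class name StripControls and the method strip are just illustrative. Note that the Java string literal "\\p{Cntrl}" holds the regex \p{Cntrl}, because the backslash has to be escaped in Java source.

```java
import java.util.regex.Pattern;

public class StripControls {
    // The Java literal "\\p{Cntrl}" is the regex \p{Cntrl}.
    private static final Pattern CNTRL = Pattern.compile("\\p{Cntrl}");

    // Hypothetical helper: removes every character in [\x00-\x1F\x7F].
    static String strip(String s) {
        return CNTRL.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // Contains a tab, a NUL, a DEL and a trailing newline.
        String raw = "ab\tc\u0000d\u007Fe\n";
        System.out.println(strip(raw)); // prints "abcde"
    }
}
```

Printable characters, including non-ASCII ones such as accented letters, are left alone.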

The Wikipedia article on control characters lists each character and what it's used for if you're interested.

aioobe answered Oct 18 '22 19:10