 

What does regex pattern "[\\P{L}]+" mean in Java? [duplicate]

Tags:

java

regex

Code:

Arrays.asList("AAAA DDDD, DDDD".split("[\\P{L}]+")).forEach(System.out::println);

Output:

AAAA
DDDD
DDDD

Please note that it's P{L} rather than p{L} (which means letters). I googled it but found nothing. Could anyone give me a hint about what it means?

Sayakiss asked Mar 30 '16 14:03



2 Answers

You can find the explanation in the Pattern Javadoc:

Unicode scripts, blocks, categories and binary properties are written with the \p and \P constructs as in Perl. \p{prop} matches if the input has the property prop, while \P{prop} does not match if the input has that property.

So it's the opposite of \p: \P{L} matches any character that is not a letter.
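To make the split in the question concrete, here is a minimal sketch. The class name NonLetterSplit and the helper splitOnNonLetters are made up for illustration. It assumes the surrounding character class brackets in "[\\P{L}]+" are redundant, since the class contains only one item, so "\\P{L}+" should behave the same way.

```java
import java.util.Arrays;
import java.util.List;

public class NonLetterSplit {
    // Hypothetical helper: splits on runs of one or more non-letter characters.
    // "\\P{L}+" is the Java string literal for the regex \P{L}+ .
    static List<String> splitOnNonLetters(String input) {
        return Arrays.asList(input.split("\\P{L}+"));
    }

    public static void main(String[] args) {
        // " " and ", " are both runs of non-letters, so each acts as one separator.
        System.out.println(splitOnNonLetters("AAAA DDDD, DDDD")); // [AAAA, DDDD, DDDD]

        // \p{L} covers letters of any script, so the precomposed e-acute (U+00E9)
        // counts as a letter and stays inside its word.
        System.out.println(splitOnNonLetters("caf\u00e9 au lait"));
    }
}
```

The space and the comma-plus-space are each consumed as a single separator because of the + quantifier; without it, ", " would produce an empty string between the two separators.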

Tunaki answered Sep 21 '22 09:09


Simple: it's the opposite of \\p{L}.

Essentially all "non-letters".

I couldn't find an exact reference in the API, but you can infer it from the behavior of, say, \\s vs \\S (which is documented there).
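The upper-case-negates convention can be checked directly. This is a small sketch, with NegatedClasses and its matches helper being illustrative names only; it compares \\s with \\S and \\p{L} with \\P{L}, and also tries [^\\p{L}], which should be another way to write the same negation.

```java
import java.util.regex.Pattern;

public class NegatedClasses {
    // Hypothetical helper: true if the entire input matches the regex.
    static boolean matches(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        String[] samples = {"a", "7", " ", ","};
        for (String s : samples) {
            // Each lower-case class and its upper-case twin should give opposite answers.
            System.out.printf("'%s'  \\s=%b \\S=%b  \\p{L}=%b \\P{L}=%b [^\\p{L}]=%b%n",
                    s,
                    matches("\\s", s), matches("\\S", s),
                    matches("\\p{L}", s), matches("\\P{L}", s),
                    matches("[^\\p{L}]", s));
        }
    }
}
```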

Edit (credit to Tunaki for having eyes)

This is actually suggested by the following statement in the documentation:

Unicode blocks and categories are written with the \p and \P constructs as in Perl.

Mena answered Sep 19 '22 09:09