The following constructs are not well documented, but they work from specific versions of PHP onwards. Which versions are these, what are these constructs, and which other implementations support them?
\H
\V
\N
This thread is part of The Stack Overflow Regex Reference.
\H matches any character that isn't horizontal whitespace. Horizontal whitespace includes the tab character and all "space separator" Unicode characters, so \H is the same as:
[^\h] or
[^\t\p{Zs}]
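As a rough illustration of the behaviour described above, here is a minimal Java sketch (Java 8 or later is assumed, since that is where the answer reports \H support). The class name NonHorizontalDemo and the helper firstRun are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class NonHorizontalDemo {
    // \H+ : one or more characters that are not horizontal whitespace.
    private static final Pattern NON_HORIZONTAL = Pattern.compile("\\H+");

    // Returns the first run of non-horizontal-whitespace characters, or "" if there is none.
    static String firstRun(String input) {
        Matcher m = NON_HORIZONTAL.matcher(input);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        // The run stops at the tab, which is horizontal whitespace.
        System.out.println(firstRun("foo\tbar"));
        // A line feed is not horizontal whitespace, so the run continues across it.
        System.out.println(firstRun("foo\nbar").length());
    }
}
```

Note that the run ends at a tab or a space but carries on across a line feed, because a line feed is vertical rather than horizontal whitespace.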
\V is the negated class of \v. It is named "non vertical whitespace character" and matches any character that isn't a vertical whitespace character, i.e. one of those treated as line breaks in the Unicode standard and matched by \v. As introduced in Perl 5.10, it is the same as:
[^\v] or
[^\n\cK\f\r\x85\x{2028}\x{2029}]
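To make the definition concrete, the following Java sketch (again assuming Java 8 or later) collects every run matched by \V+. NonVerticalDemo and runs are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class NonVerticalDemo {
    // \V+ : one or more characters that are not vertical whitespace.
    private static final Pattern NON_VERTICAL = Pattern.compile("\\V+");

    // Collects every maximal run of non-vertical-whitespace characters.
    static List<String> runs(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = NON_VERTICAL.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // \r, \n, U+2028 and \f all end a run; a tab does not.
        System.out.println(runs("one\r\ntwo\u2028three\ffour"));
        System.out.println(runs("a\tb").size());
    }
}
```

Each of the line-break characters listed above acts as a separator here, while horizontal whitespace such as a tab stays inside the run.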
\N matches any character that isn't the line feed character \n. Simple! It is the same as:
[^\n]
\V+ and \N+? Thanks to Avinash Raj for asking.
As the Perl 5.10 documentation specifies, \V is the same as [^\n\cK\f\r\x85\x{2028}\x{2029}] and shouldn't match any of \n, \r or \f, nor the control character \cK (Ctrl+K, the vertical tab 0x0B; *nix), 0x85, 0x2028 and 0x2029. \N, by contrast, excludes only \n.
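The difference can be seen side by side in a small Java sketch. Because java.util.regex does not accept \N, the sketch uses [^\n]+ as a stand-in, relying on the equivalence stated earlier in this answer. The class VersusDemo and its methods are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class VersusDemo {
    // \V+ stops at any vertical whitespace character.
    private static final Pattern NON_VERTICAL = Pattern.compile("\\V+");
    // Stand-in for \N+ (assumption: equivalent to "anything but line feed").
    private static final Pattern NON_LINE_FEED = Pattern.compile("[^\\n]+");

    private static String first(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group() : "";
    }

    static String firstNonVertical(String input) {
        return first(NON_VERTICAL, input);
    }

    static String firstNonLineFeed(String input) {
        return first(NON_LINE_FEED, input);
    }

    public static void main(String[] args) {
        String text = "ab\rcd\nef";
        // \V+ ends at the carriage return: 2 characters ("ab").
        System.out.println(firstNonVertical(text).length());
        // [^\n]+ runs through the carriage return and ends at the line feed: 5 characters.
        System.out.println(firstNonLineFeed(text).length());
    }
}
```

On input containing only \n line breaks the two behave identically. They diverge as soon as \r, \f or one of the other vertical whitespace characters appears.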
These character classes are handy and incredibly effective when you want to match everything within a single line of text (\V+) or simply consume an entire paragraph (\N+), among various other use cases.
The following implementations support \H, \V and \N:
PHP (see phpinfo()). By default, PHP 5.2.2 does.
java.util.regex.Pattern: support for the \H and \V constructs was added as part of implementing \h and \v, which was not the case in Java 7; however, \N is not yet supported. Tested with JDK8u25.
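One way to check a given JDK yourself is to try compiling each escape and see whether Pattern.compile rejects it. This is only a sketch: EscapeSupportProbe and compiles are hypothetical names, and a pattern that compiles is not proof that it has the semantics described here.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical probe class; the names are illustrative only.
final class EscapeSupportProbe {
    // Returns true if this JDK's regex engine accepts the given pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String regex : new String[] {"\\H+", "\\V+", "\\N+"}) {
            System.out.println(regex + " -> " + compiles(regex));
        }
    }
}
```

On a Java 8 or later runtime the first two patterns should be accepted and \N+ rejected, in line with the results reported above for JDK8u25.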