In Perl 6, you can use <.ws> to match whitespace characters. I want to match any character that doesn't match <.ws>, but I don't think I can use \S instead because I believe that only matches ASCII spaces while <.ws> will match any Unicode space. How do I do this?
Non-word character: \W. Whitespace character: \s. Non-whitespace character: \S.
A usage of <.ws> is a call to the ws token that does not capture its result. Its default behavior is:
token ws { <!ww> \s* }
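Which means that it must not be between two word (\w) characters (the <!ww> assertion), and that it will match zero or more whitespace characters (the \s* part).
In a given grammar, that can be overridden to specify the "whitespace" of the current language. In the Perl 6 language grammar, for example, ws includes parsing of comments, Pod, and even heredocs!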
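As a rough illustration of the default ws token described above, here is a minimal sketch in Perl 6 (Raku). The sample strings and variable names are made up for the example, and it assumes the default ws has not been overridden:

```raku
# <.ws> calls the ws token without capturing; <ws> calls it and captures.
my $spaced   = so 'a b' ~~ / a <.ws> b /;            # whitespace between the letters
my $adjacent = so 'ab'  ~~ / a <.ws> b /;            # <!ww> fails between two word characters
my $punct    = so 'a+b' ~~ / a <.ws> '+' <.ws> b /;  # \s* may match nothing next to '+'

say $spaced;    # True
say $adjacent;  # False
say $punct;     # True

# The capturing form leaves a named capture behind, the dotted form does not.
my $captured     = 'a b' ~~ / a <ws>  b /;
my $not-captured = 'a b' ~~ / a <.ws> b /;
say $captured<ws>.defined;      # True
say $not-captured<ws>.defined;  # False
```

Note that <.ws> can succeed while consuming nothing, so it is not a test for "there is a whitespace character here"; that is what \s is for.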
By contrast, \s is the character class for matching a single whitespace character, and \S means "not a whitespace character". This definition is Unicode based; if we do:
say .uniname for (0..0x10FFFF).map(*.chr).grep(/\s/)
Then we get:
<control-0009>
<control-000A>
<control-000B>
<control-000C>
<control-000D>
SPACE
<control-0085>
NO-BREAK SPACE
OGHAM SPACE MARK
EN QUAD
EM QUAD
EN SPACE
EM SPACE
THREE-PER-EM SPACE
FOUR-PER-EM SPACE
SIX-PER-EM SPACE
FIGURE SPACE
PUNCTUATION SPACE
THIN SPACE
HAIR SPACE
LINE SEPARATOR
PARAGRAPH SEPARATOR
NARROW NO-BREAK SPACE
MEDIUM MATHEMATICAL SPACE
IDEOGRAPHIC SPACE
Therefore, most probably \S is what you are looking for.
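To see \S treating non-ASCII spaces as whitespace, a small sketch along these lines may help. The test string is invented for the example; it separates three words with IDEOGRAPHIC SPACE (U+3000) and NO-BREAK SPACE (U+00A0), both of which appear in the list above:

```raku
# Words separated by Unicode (non-ASCII) whitespace characters.
my $text  = "foo\x[3000]bar\x[A0]baz";

# \S+ matches runs of non-whitespace, so the Unicode spaces act as separators.
my @words = $text.comb(/\S+/);
say @words;   # [foo bar baz]
```

If \S only knew about ASCII spaces, the whole string would come back as a single element.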