I was learning regular expressions in iOS and saw this tutorial: http://www.raywenderlich.com/30288/nsregularexpression-tutorial-and-cheat-sheet
It reads like this for \b:
\b matches word boundary characters such as spaces and punctuation. to\b will match the "to" in "to the moon" and "to!", but it will not match "tomorrow". \b is handy for "whole word" type matching.
and \s:
\s matches whitespace characters such as spaces, tabs, and newlines. hello\s will match "hello " in "Well, hello there!".
I have two questions on this:
1) What is the difference between \s and \b? When should I use which?
2) \b is handy for "whole word" type matching -> I don't understand what this means.
I need some guidance on these two.
\B is the negated version of \b. \B matches at every position where \b does not. Effectively, \B matches at any position between two word characters as well as at any position between two non-word characters.
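To make the \b versus \B contrast concrete, here is a minimal sketch using Python's re module. Python is an assumption on my part (the question is about NSRegularExpression), and the sample string is made up, but \b and \B should behave the same way on plain ASCII text like this.

```python
import re

text = "cat scatter"  # hypothetical sample text

# \bcat: "cat" must start at a word boundary
starts_with_boundary = [m.start() for m in re.finditer(r"\bcat", text)]

# \Bcat: "cat" must NOT start at a word boundary (it sits inside a word)
starts_without_boundary = [m.start() for m in re.finditer(r"\Bcat", text)]

print(starts_with_boundary)     # [0]  -> the standalone word "cat"
print(starts_without_boundary)  # [5]  -> the "cat" inside "scatter"
```

Every candidate position satisfies exactly one of the two, which is what "negated version" means here.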
With some variations depending on the engine, regex usually defines a word character as a letter, digit or underscore. A word boundary \b detects a position where one side is such a character, and the other is not.
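The word-character definition above is also what makes "whole word" matching work. A rough sketch in Python's re module (chosen for illustration, not the iOS API; the sample strings beyond the tutorial's own are my additions):

```python
import re

trailing = re.compile(r"to\b")    # "to" followed by a word boundary
whole = re.compile(r"\bto\b")     # "to" with a boundary on both sides

samples = ["to the moon", "to!", "tomorrow", "to_do", "to2"]
trailing_hits = {s: bool(trailing.search(s)) for s in samples}

# Letters, digits and the underscore all count as word characters,
# so there is no boundary after "to" in "tomorrow", "to_do" or "to2".
print(trailing_hits)

# to\b alone still matches the tail of "into"; \bto\b does not.
print(bool(trailing.search("into the woods")))  # True
print(bool(whole.search("into the woods")))     # False
```

So "whole word" matching means wrapping the word in \b on both sides, so that it is not matched as part of a longer word.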
The Difference Between \s and \s+
For example, the expression X+ matches one or more X characters. Therefore, the regular expression \s matches a single whitespace character, while \s+ matches one or more whitespace characters.
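A small sketch of the \s versus \s+ difference, again assuming Python's re module and a made-up input string:

```python
import re

text = "a  b\t\nc"  # two spaces, then a tab followed by a newline

single = re.findall(r"\s", text)   # one match per whitespace character
runs = re.findall(r"\s+", text)    # one match per run of whitespace

print(single)  # [' ', ' ', '\t', '\n']
print(runs)    # ['  ', '\t\n']

# A common use of \s+: collapse every run of whitespace to one space.
collapsed = re.sub(r"\s+", " ", text)
print(collapsed)  # a b c
```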
\s stands for "whitespace character". Again, which characters this actually includes depends on the regex flavor. In all flavors discussed in this tutorial, it includes [ \t\r\n\f]. That is: \s matches a space, a tab, a carriage return, a line feed, or a form feed.
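As a quick check of that character list, the sketch below tests each character against \s in Python's re module (an assumption; other flavors may include more characters, but these five should match everywhere). It also reruns the tutorial's hello\s example.

```python
import re

whitespace_chars = {
    "space": " ",
    "tab": "\t",
    "carriage return": "\r",
    "line feed": "\n",
    "form feed": "\f",
}

# fullmatch succeeds only if the whole one-character string is whitespace
is_whitespace = {
    name: re.fullmatch(r"\s", ch) is not None
    for name, ch in whitespace_chars.items()
}
print(is_whitespace)

# The match includes the space itself, because \s consumes a character.
hello = re.search(r"hello\s", "Well, hello there!").group()
print(repr(hello))  # 'hello '
```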
\b: Boundary characters
\b matches the boundary itself but not the boundary character (like a comma or period). It has no length in itself but can be used to find, for example, an e at the end of a word.
For example, in the sentence: "Hello there, this is one test. Testing"
The regex e\b will match an e if it's at the end of a word (followed by a word boundary). Here it matches the final e in "there" and the e in "one". Notice that the e in "test" and "Testing" didn't match, since that e is not followed by a boundary.
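The same search can be reproduced in code. This is a sketch in Python's re module rather than NSRegularExpression, so treat it as an illustration of the pattern, not of the iOS API:

```python
import re

sentence = "Hello there, this is one test. Testing"

# Record where each e-followed-by-a-boundary occurs
positions = [m.start() for m in re.finditer(r"e\b", sentence)]
print(positions)  # [10, 23]

# Show the text up to and including each match, to see which word it ends
for pos in positions:
    print(sentence[: pos + 1])
# Hello there
# Hello there, this is one
```

The e in "Hello", "test" and "Testing" is followed by another letter, so there is no boundary after it and it is skipped.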
\s: Whitespace
\s, on the other hand, matches the actual whitespace characters (like spaces and tabs). In the same sentence it will match all the spaces between the words.
Since \b doesn't make much sense alone, I showed how to use it as e\b (above). The OP asked (in a comment) what e\s would match compared to e\b, to better explain the difference between \b and \s.
In the same string there is only one match for e\s, while there were two matches for e\b, since the comma is not whitespace. Note that the e\s match includes the whitespace, whereas the e\b match doesn't.
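Putting the two patterns side by side on the same sentence shows both differences at once: the number of matches and what each match contains. A minimal sketch, assuming Python's re module:

```python
import re

sentence = "Hello there, this is one test. Testing"

boundary_matches = [m.group() for m in re.finditer(r"e\b", sentence)]
whitespace_matches = [m.group() for m in re.finditer(r"e\s", sentence)]

# \b is zero-width: each match is just the letter e
print(boundary_matches)    # ['e', 'e']

# \s consumes a character: the match is e plus the space after "one".
# The e in "there," is not matched because a comma is not whitespace.
print(whitespace_matches)  # ['e ']
```

In short, use \b when you only want to assert where a word starts or ends, and \s when the whitespace itself should be part of the match.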